EDITORIAL


The division of the Journal of Abnormal 
and Social Psychology into two journals, a 
long debated and worrisome move, has en- 
countered few, if any, of the anticipated dif-
ficulties. Perhaps postdecision dissonance has
operated to make the editors too selective in 
evaluating the evidence related to the separa-
tion of the journal. Nevertheless there have
been virtually no problems about jurisdiction 
nor about any scarcity of good manuscripts 
for either journal. Moreover, there are ex-
cellent prospects in the future for a more
adequate page allotment for papers from the 
three areas of social, personality, and abnor- 
mal, and strong indications that both of the 
new journals will more than pay for themselves 
in terms of subscriptions. And the feedback
from our authors and readers has been almost 
uniformly positive. Less important, but not 
altogether irrelevant, is the fact that your 
editor and associate editor, though still over- 
loaded, have occasionally been able to come 
up for air. 

The issue of specialization and prolifera-
tion of subareas in psychology, however, is
less easy of assessment. But the facts are that 
specialization in this broad domain had al-
ready occurred and the split of the journals
was a belated recognition of the existing state 
of affairs. 

The second year of the new journal starts 
more hopefully than its initial year in that we 
should be able to reduce our publication lag 
substantially within the next 12 months. This 
is of course a rash statement in that social
prophesying can be self-defeating as well as 
self-fulfilling. The mere announcement that 
authors can expect quicker publication is 
likely to mean that the input of new manu-
scripts will jump. On the other hand, after
long and heated discussion, the Board of Di- 
rectors has recognized our page needs and its 
action has been approved by the Council of 
Representatives. In addition, I will receive 
manuscripts only through December of this
year (since my term of office expires in 1967)
and hence must cut publication lag to 12
months to process all papers accepted during
my term. The new editor who will be chosen
this year will receive manuscripts beginning
January 1, 1967.
In the past, publication policies of the APA 
have tended to lag behind the exponential 
growth of research and experimentation. De- 
cisions about the splitting of the old JASP 
and granting of a more adequate page allot- 
ment were reached slowly and with some- 
thing less than deliberate speed. This con- 
servatism was due in part to the lack of 
thorough research on the complex issues in- 
volved to provide information for policy
guidelines. Within the past few years, how-
ever, the Project on Scientific Information
and Exchange in Psychology, supported by
NSF, has been gathering data on various
aspects of dissemination of information so
that future publication policies can have the
benefit of some operational research. The
studies conducted by this project show, for 
example, that the immediate communication 
needs of researchers are not met through the 
traditional journals. A communication net- 
work has grown up involving the circulation 
of preprints, reading of papers at meetings, 
and informal contacts and correspondence for 
people in a given area of research. This 
would suggest that some APA journals may 
wish to give considerable space to 2-page re- 
ports of research as is done in Psychonomic 
Science. Brief reports of this type could be 
published within 3 months of their receipt. 
Such a summary report could not be blown 
up into a more detailed paper without addi- 
tional research for later publication since this 
would constitute duplicate publication. Au- 
thors could choose between brief and quick 
reports or full and later publication. This may 
not be the best way of meeting the problem 
as delineated by the Scientific Information 
Project; it is merely offered as an illustration 
of one way of dealing with the issue. 
Whether or not the Journal moves to ac- 
cept brief reports for immediate publication, 
it should continue to provide adequate space 
for longer papers. Scientific journals have 
other functions than a reader's digest to meet 
the immediate communication needs of re- 
searchers. A major function is to provide the 
substantive and methodological materials for
a field of knowledge on the basis of which 
textbooks are written, theoretical and integra- 
tive articles are prepared, and undergraduate 
and graduate courses are taught. Though gen- 
eral purpose reading has declined, many peo- 
ple utilize the periodicals in their field for 
such specific purposes as working up lectures 
and preparing materials for classroom discus- 
sions and writing articles or books to sys- 
tematize findings or to present a new theoreti- 
cal framework. To reduce journals to an 
archive of brief research reports would seri- 
ously impair their usefulness as a major 
scholarly source of the profession. 


In the remaining year of my editorship, I
plan no major change in policy. The purpose
will remain to make the journal as representa-
tive of the significant, or potentially signifi-
cant, research in the wide domain of per- 
sonality and social psychology as is possible. 
In attempting a wide coverage of the varied 
interests of APA members, we will continue to 
publish studies ranging from physiological 
correlates of social behavior to field research 
on psychological aspects of social structure. 
Our journal in consequence will continue to 
lack theoretical and methodological focus. Its 
objective is to help advance knowledge in per- 
haps the broadest area of psychology and, as 
an official journal of psychologists in this 
field, the definition of central issues must come 
from the research and not from the personal 
values of the editor. Preference will be given 
to papers which move us ahead through their 
definitive contributions in well-trodden paths 
of investigation or to less definitive articles 
which report new developments. Another as- 
pect of editorial policy needs reiteration. Both 
descriptive and hypothesis-testing studies are 
acceptable. But the descriptive study should 
go either wide or deep and should generally 
have some sampling design. The hypothesis-
testing study should be clear, both with re-
spect to experimental manipulations and 
theoretical rationale. Finally, in spite of con- 
tinuing overload, the journal receives far too 
few papers of the following types: (a) theo- 
retical pieces of a fairly formal character (not 
necessarily mathematical but more systematic
than the presentation of an interesting idea),
(b) research which attempts to test two al-
ternative theoretical formulations, and (c)
field studies of social process and social
structure.

Because of the breadth of interests covered
by the Journal and because of the heavy
manuscript input, I have had to rely heavily
not only upon our very able associate editor
and board members, but also on many con-
sultants who have been most generous in con-
tributing their time and scholarship in re-
viewing manuscripts. I am most grateful to
the following consultants who have reviewed
two or more manuscripts during the past year:
Within-sex friendship was assessed by sociometric nomination in college dor-
mitories, and personality by both peer reputation and self-description. Among
approximately 93 pairs of female friends and 90 pairs of male friends, reputa-
tions were similar and self-descriptions were essentially unrelated. The sig-
nificant cross-trait correlations for friends’ reputations were also interpreted as
indicating reputational similarity. Among a subsample of closer friends, the
magnitude of reputation correlations was higher whereas self-descriptions re-
mained unrelated. The reputations of nonreciprocated 1st-choice pairs were
unrelated. The results could be interpreted as indicating that friendship produces
similarity without altering self-descriptions. The preferred interpretation is
that the reputational similarity represents judgment error on the part of peers.
Friends’ common navigation through time and space and their true similarity
on attitude, interest, value, skill, and socioeconomic dimensions are mistakenly
generalized by the raters to the personality trait dimensions used in the present
study.



The possibility of describing the relation 
between friends’ personalities by a single gen- 
eral relationship has been a persistently in- 
triguing one. Indeed, so much so, that the two 
simplest and most general relations between 
friends, though contradictory, frequently ap- 
pear as aphorisms reflecting our accumulated 
cultural wisdom on interpersonal relations: 
“birds of a feather flock together” and “op-
posites attract.” These two types of relations
can be designated similarity and contrast.
Winch (1952), in speaking of complementary 
relations between the traits of married cou- 
ples, has presented the most influential hy-
potheses for marriage research within the last
decade (Tharp, 1963). He speaks of within-
trait complementarity (A is independent and
B is dependent) as well as cross-trait comple-
mentarity (A is deferent and B is hostile),
and views both as antagonistic or opposite
to a similarity relation.

1 This research was supported by NIH Grant
M 1544 to Northwestern University, Donald T.
Campbell, principal investigator. We thank Donald
C. Butler and Barry E. Collins for comments on
an early draft.

2 Deceased.

3 Because different trait labels can have similar
meaning, cross-trait relations of either type are also
possible. For example, assume that to scheme is one
manifestation of hostility. If A is scheming and B
is hostile, this would be a similarity cross-trait
relation. On the other hand, if A is scheming and
B is unhostile, the cross-trait relation would be of
the contrast type.

Interestingly, researchers who have searched 
for a single ubiquitous relation between the 
personalities of friends, with perhaps the 
single exception of the work of Winch and his 
co-workers on data from 25 undergraduate 
married couples (Winch, 1955a, 1955b; 
Winch, Ktsanes, & Ktsanes, 1954, 1955), have 
found either no relation (e.g., Bowerman & 
Day, 1956; Hoffman, 1958; Izard, 1963; 
Katz, Glucksberg, & Krauss, 1960; Reilly, 
Commins, & Stefic, 1960) or trends in the 
direction of similarity (e.g., Banta & Hether- 
ington, 1963; Beier, Rossi, & Garfield, 1961; 
Corsini, 1956; Day, 1961; Izard, 1960; 
Kelly, 1955; Murstein, 1961; Newcomb, 
1956). 

If, as the bulk of previous research sug- 
gests, similarity is the most reliable relation 
between the personalities of friends, the ques- 
tion of how this relation should be interpreted
still remains. Selections of friends based on 
true similarity can be predicted by both bal- 
ance and reinforcement theories of interper- 
sonal interaction. It is also possible, however, 
that data indicating similarity do not in fact 
reflect balance or reinforcement processes at 
all but rather have arisen through judgment 
error. To the extent that true similarity on 
some single general personality trait (such as
the extent to which one is well liked) gen-
eralizes to other specific traits, ubiquitous
similarity would be found. While such simi-
larity is not in itself artifactual, it leads to
misinterpretation. Similarity is mistakenly at-
tributed to the relation between the more
specific traits of friends and therefore possi-
ble specific contrast relations are occluded.
This source of error and a possible statistical 
cure are discussed in detail in the Method 
section. 

The present study, while dealing solely with 
same-sexed friendships, adds to knowledge on 
friendship choice by including a number of 
features which either singly or in combina- 
tion are absent from previous studies: it em- 
ploys statistical procedures designed to elimi- 
nate the artifacts that commonly plague 
research requiring multiple judgments on mul- 
tiple dimensions; the sample of friendship 
pairs is large; reputational definitions of per- 
sonality are included as well as the more usual 
self-description definition; both cross-trait 
and within-trait relations are examined. To 
add further to the interpretability of the major 
analyses, subanalyses are performed to exam- 
ine the effects of closeness of friendship and 
nonreciprocation of friendship. 


METHOD 
Subjects 


In connection with another problem (Campbell,
Miller, Lubetsky, & O'Connell, 1964) data were col-
lected from 237 females and 225 males from 19
residential groups. The groups primarily included
Northwestern University fraternities, sororities, and
dormitories, but also included several groups from a
Catholic university and a teachers college. All stu-
dents who had resided at least 1 year in the living
unit were allowed to volunteer. Following the last
of three data collecting sessions, each subject was
paid $5.


Instruments and Data Collection 


The complete measurement techniques and pro- 
cedures are described in detail elsewhere (Campbell 
et al., 1964). While only the measurement of person- 
ality and the determination of friendship choices are 
relevant, to give some idea of the context in which 
the data for this study were obtained, the entire
procedure is briefly presented in chronological se-
quence. In the first data collection session subjects
used a 9-point scale with described ends and mid-
point to rate each of their fellows as well as them-
selves on 27 personality traits. On completion, sub-
jects indicated for all high ratings whether the per-
son rated was aware that he possessed the trait. In
the second session subjects rated photographs of 30
persons of varying age and sex on the same set of
personality traits. The third session asked for back-
ground, personal history, socioeconomic information,
a second self-rating, ratings of family members, an
estimate of average rating received from others on
each trait, a rating of the desirability of the 27 traits,
and an ordered list of one’s five closest friends
within the residential group. The three data collec-
tion sessions, scheduled approximately 1 week apart,
took about 5 hours. The research was presented to
the subjects as a collection of normative personality
data from college subjects and as a study of the
ability to judge personality from photographs. There
was no reason to suspect that any subject doubted
these facades.

With the exception of one group which made
their ratings in their individual rooms, ratings were
made in a common room in the presence of an
experimenter. The instructions encouraged the sub-
jects to use the entire range of the 9-point scale and
to make their ratings in terms of the norms for their
own living group. Printed definitions and descrip-
tive comparisons of the traits were provided with
each set of materials and read by the experimenter
at the start of the first session. Additional clarifica-
tion was provided whenever necessary. All ratings
were made across traits, one person at a time. In
some dormitory groups where a subject occasionally
insisted that he did not know a particular person
and could not rate him, he was permitted to skip
that person.

As indicated, the present report is concerned solely
with the sociometric choices and their relation to two
types of scores on the personality traits: reputation
scores and self-description scores.


Friendship Pairs 


The major analysis is based on a pairing that
maximizes the number of reciprocal friendship pairs.
All pairs consist of persons who chose each other
somewhere in their ordered lists of five closest
friends within the living group. Where possible,
persons whose choices of one another were mutually
high on their lists were paired. However, lower
ranked choices were matched where necessary in
order to maximize the number of pairs. Those per-
sons for whom there was no reciprocal choice and
those persons who either omitted the sociometric
task, only listed friends outside the living group, or
only listed persons who had lived in the residential
unit less than 1 year were eliminated. One hundred
pairs of females and 89 pairs of males for whom
there were reputation scores were constructed. Of
these, the number of pairs for whom there were
self-description scores based on two self-descriptions
was slightly reduced: 93 females and 87 males.
Direct information on the reliability and validity of
the friendship pairings is unavailable. While the in-
structions to the subjects required them to list their
actual rather than preferred friends, it is possible
that portions of some lists represent wishful fantasy.
However, the use of reciprocal choice pairs and the
large proportion of the total population actually
included in the pairs would also seem to attest
strongly to the validity of the subjects’ lists. If most
choices represented wishful fantasy, the most likely
outcome would be a listing by each subject of the
most popular residents. Such overchoice of the
“stars” by all would result in inability to include
most subjects in a reciprocal pair.
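
The pairing rule described above can be sketched in code. The article does not state the exact procedure used, so the following is only an illustrative greedy version: it pairs mutual choices with the best combined ranks first and does not guarantee the maximum possible number of pairs. The function name, the input format, and the example names are hypothetical.

```python
from itertools import combinations

def reciprocal_pairs(choices):
    """Greedily pair people who list each other, best mutual ranks first.

    `choices` maps each person to an ordered list of chosen friends
    (a hypothetical input format). Each person ends up in at most one pair.
    """
    candidates = []
    for a, b in combinations(sorted(choices), 2):
        if b in choices[a] and a in choices[b]:
            # A lower rank sum means each placed the other higher on the list.
            rank_sum = choices[a].index(b) + choices[b].index(a)
            candidates.append((rank_sum, a, b))

    paired, pairs = set(), []
    for _, a, b in sorted(candidates):
        if a not in paired and b not in paired:
            pairs.append((a, b))
            paired.update((a, b))
    return pairs

choices = {
    "Ann": ["Bea", "Cal"],
    "Bea": ["Ann", "Dee"],
    "Cal": ["Ann", "Dee"],
    "Dee": ["Cal", "Bea"],
    "Eve": ["Ann"],  # unreciprocated: Ann does not list Eve
}
pairs = reciprocal_pairs(choices)
print(pairs)
```

In this toy input Eve is dropped because her choice is not reciprocated, which mirrors the elimination rule in the text. A true maximum pairing would require a graph-matching algorithm rather than this greedy pass.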

Personality Definitions

While the simplest approach to the reputational
definition of personality would have been the use of
the mean rating received by an individual on a given
trait, the likelihood of predictable biases in these
scores led to the computation of “double-standard-
ized” scores. When an individual is requested to
make multiple trait ratings, a typical finding is a
first-order factor of “favorability” or “like-dislike.”
This halo or extremely potent trait of being “well
liked” (or “hated”) largely determines the rating a
person receives on every other trait. Thus, instead of
obtaining ratings on 27 different traits, one could
easily obtain in large part 27 repetitions of a single
rating on the like-dislike dimension. In other words
the scores from all 27 traits could be summed and
this total would be similar to a 27-item attitude
scale designed to measure social approval or liking.
This was in fact done and the mean entered as the
twenty-eighth score entitled “general unfavorability.”
The high reliability of this scale total supports the
notion that the entire profile of scores on the 27
traits would be unfavorable for a disliked person
and favorable for a liked person.
If the persons who choose one another as friends
are more similar in the extent to which they are
liked or disliked by the members of their residential
unit as compared to randomly paired persons, simi-
larity between the reputations of friends on all
traits is automatically built into the raw data. No
opportunity would exist for finding contrast rela-
tions between the personalities of friends. This situ-
ation is illustrated in Figure 1a. Since it is postulated
that each trait contributes equally to the general
unfavorability score, a change in liking for an indi-
vidual would not change the pattern among the
traits; it would merely move the entire profile up or
down the scale. Illustrative average reputational
ratings on five traits for each of three pairs of
friends are presented in Figure 1a. It is immediately
apparent that Pair C is disliked. They are generally
judged as incapable, insincere, conceited, greedy,
and unscrupulous. This difference between the gen-
eral approval of Pair C and other pairs is reflected
in the differences between the average rating on all
traits for each of the three pairs. Both members of
Pair C average 8 whereas both members of Pair A
average 2, and both members of Pair B average 5.
If the correlations between these three pairs of
friends were computed for any of the five traits, the
correlation would inevitably be positive. This is true
even though for all three pairs, when one member
tends to be high on a given trait (relative to the
average rating of the pair across traits), the other
tends to be low.

[Figure 1; panel 1b: Standardized reputations]

Fig. 1. Fictitious reputation scores for three pairs
of friends (A,A'; B,B'; and C,C') which illustrate
the exaggeration of friends’ similarities (Figure 1a)
when reputation is not adjusted for average favor-
ability by standardization (Figure 1b). (See text
for explanation.)
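
The argument can be checked numerically with a small sketch. The numbers below are invented for illustration and are not the values plotted in Figure 1: each pair shares an overall level (2, 5, or 8), and within each pair the two members deviate from that level in opposite directions on every trait. The helper names are hypothetical, and dividing by the within-person standard deviation is one assumed reading of the z-score conversion.

```python
from statistics import mean, pstdev

def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def within_person_z(profile):
    """Express one person's trait scores as z scores around their own mean."""
    m, s = mean(profile), pstdev(profile)
    return [(v - m) / s for v in profile]

# Invented data: pair level plus opposite-signed deviations for the two friends.
levels = {"A": 2.0, "B": 5.0, "C": 8.0}
deviations = {
    "A": [1.0, -1.0, 0.5, -0.5, 0.0],
    "B": [-1.0, 1.0, 0.0, 0.5, -0.5],
    "C": [0.5, 0.0, -1.0, 1.0, -0.5],
}
first = {p: [levels[p] + d for d in deviations[p]] for p in levels}
second = {p: [levels[p] - d for d in deviations[p]] for p in levels}

def trait_correlations(one, other):
    """Correlate friends across the three pairs, separately for each trait."""
    order = sorted(one)
    return [
        pearson([one[p][t] for p in order], [other[p][t] for p in order])
        for t in range(5)
    ]

raw_r = trait_correlations(first, second)
std_r = trait_correlations(
    {p: within_person_z(v) for p, v in first.items()},
    {p: within_person_z(v) for p, v in second.items()},
)
print([round(r, 2) for r in raw_r])
print([round(r, 2) for r in std_r])
```

With these invented scores every raw within-trait correlation is positive, while the same correlations computed on within-person z scores are strongly negative, which is the pattern the text describes.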

For this reason, each person’s average reputation 
on a given trait was converted into a score which 
represented his general tendency to be high on this 
trait to a greater extent than on other unfavorable 
traits. This was achieved by the “first standardiza- 
tion.” All of a person’s mean reputation values in the 
unfavorable direction were averaged (after reversing 
four favorably worded traits) and the trait-reputa-
tion means then converted into z scores representing
deviations above and below this general tendency to 
receive unfavorable ratings from others. 

The result of such a transformation is illustrated
in Figure 1b, where the scores from Figure 1a have
been replotted holding constant the average favor-
ability of each individual’s scores. Note that in
Figure 1b, where the ratings for each trait are ex- 
pressed as deviations from the individual’s average, 
the pattern suggested is quite different. Analysis of 
the unstandardized scores (Figure 1a) stresses the gross
differences in the displacements of entire profiles. 
Stated in another way, similarity between friends on 
the like-dislike factor will obscure the more subtle 
correlations which would remain in the residual cor- 
relation matrix once this factor had been removed. 
Of course, in the actual data, this is not the only 
factor present among the traits. Even after its re- 
moval, the traits show substantial intercorrelation. 
However, factor analysis of the raw reputation 
scores and the standardized reputation scores sup- 
ports our arguments. While a like-dislike factor is 
the first factor extracted from the raw scores, no 
comparable factor is found among the standardized 
scores.

Essentially the same rationale eliminates the use of 
each person's mean self-description on each trait and
justifies the "first standardization" for self-descrip- 
tion across the 27 traits. In the case of self-descrip- 
tion, some potent general factor such as self-esteem, 
defensiveness, or some other general response bias 
might largely contribute to the overall level of fa- 
vorability of the self-ratings. Again, if for some 
reason friends tend to be similar on any one general 
factor, similarity between friends' self-descriptions is 
built into all the specific trait ratings. More subtle
contrast relations between friends could be discov-
ered only after
removal of any more general factor affecting the 
favorability of self-description. 

Previous research and other general considerations 
led to the expectation that correlations between 
friends’ personalities would be very low even where 
significant. Since reputations were used as one defini- 
tion of personality and since residential groups in 
which people know one another well enough to make 
trait ratings rarely have more than 35 members, it 
was clearly necessary to pool data from many groups. 
The range of groups used made cultural differences 
likely. Idiosyncratic habits of language usage within
the different residential units or different norms 
for the favorability of different traits could once 
again lead to artifactual similarity of friends’ trait 
ratings in that friendship pairs were always selected 
from within rather than across residential units. Such 
differences in norms for ascription or habits of 
language use clearly exist for males and females. A 
trait like “aggressive” would more readily be as-
cribed to males than to females. The same kinds of
differences, however, may occur between fraternity
houses. In one, where great emphasis is placed on
competitive sports, “aggressive” may be a “good”
trait. In another, involved in some intrafraternal or
community project, “nonaggression” may be a “good”
trait. If so, when pairs from both residential units are
included in a single correlational analysis, the corre-
lation would emerge positive.

4 We thank Barry E. Collins for aid in performing
these analyses.

To avoid this irrelevant source of an apparent 
similarity correlation, the scores of each person on 
each trait have also been standardized in the more
usual fashion around each residential group's own
mean. This second standardization was also per-
formed on both the reputation as well as the self- 
description scores. It is regarded as achieving essen- 
tially the same results as if on each trait the cor- 
relation between friends was computed separately 
for each residential unit and then all were averaged. 

Thus, for both reputations and self-descriptions, an 
individual's scores were converted into z scores devi- 
ating from the 0 point assigned to his average score 
across traits. After this first standardization across
traits and within persons, a second standardization
was performed within a trait and across persons
within the particular living group. That is, the z
scores of each person on each trait were restandard-
ized around each residential unit’s own mean for
that trait. All major analyses were performed on
these double-standardized scores.5 To check on the
validity of the preceding arguments, however, some
analyses were also performed on the simple raw
scores. These further substantiate our arguments and
are discussed with the results. Nevertheless, while
curing real problems, the double-standardization pro- 
cedure does not correct for all possible types of 
judgment errors. 
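
The two standardizations can be sketched as follows. This is a minimal illustration under stated assumptions, not the computation actually run for the study: it assumes that all traits are already keyed in the unfavorable direction (the reversal of the four favorably worded traits is omitted), that each standardization divides by the relevant standard deviation, and that no profile or group column is constant. The function names and example data are hypothetical.

```python
from statistics import mean, pstdev

def zscores(values):
    """Return z scores (mean 0, SD 1); fails if all values are identical."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def double_standardize(scores, group_of):
    """scores: person -> list of trait scores; group_of: person -> group label."""
    # First standardization: across traits, within each person.
    first = {person: zscores(profile) for person, profile in scores.items()}
    n_traits = len(next(iter(first.values())))
    result = {person: [0.0] * n_traits for person in first}
    # Second standardization: within each trait, across the persons
    # who belong to the same residential group.
    for group in set(group_of.values()):
        members = [p for p in first if group_of[p] == group]
        for t in range(n_traits):
            column = zscores([first[p][t] for p in members])
            for person, z in zip(members, column):
                result[person][t] = z
    return result

# Invented ratings for two small groups and three traits.
scores = {
    "p1": [2, 4, 9], "p2": [5, 3, 7], "p3": [8, 6, 1],
    "q1": [1, 2, 6], "q2": [4, 9, 5], "q3": [7, 3, 8],
}
group_of = {name: name[0] for name in scores}
double = double_standardize(scores, group_of)
```

After the second step every trait has mean 0 and unit variance within each group, so between-group differences in trait norms can no longer produce a pooled correlation between friends.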

It makes little sense to look for relations between
the personalities of friends unless the measures of 
personality are satisfactory. The reputation reliabili- 
ties of the double-standardized scores were computed 
using the split-half technique, which required that 
the double-standardization be carried out separately 
for each half of the data as well as for the pooled 
data. The ratings by odd-numbered and even-num- 
bered judges were double-standardized separately. 
For the self-description reliabilities, the two self-
descriptions obtained on separate occasions were
double standardized separately.

5 Ideally, the double-standardization procedure
would have been carried out, not on the data from
all the subjects, but rather, only on the data from
those subjects who were part of a friendship pair.
However, other purposes had required use of all the
cases for double standardizations of reputation and
self-description scores. In view of the expense of
the double-standardization procedure, because the
procedure for constructing friendship pairs allowed
inclusion of almost all the persons for whom reputa-
tion and self-description scores were available, and
because inspection of the mean reputations and self-
descriptions of the few persons not included in
friendship pairs revealed no systematic deviation
from those who were included, the double stand-
ardization was not redone.


After Spearman-Brown correction of the reputa-
tion reliabilities, they range from .47 to .95 and .40
to .93 for females and males, respectively, with
median reliabilities of .79 and .70. The “multitrait-
multimethod matrix” considerations suggested by
Campbell and Fiske (1959) stress the importance of
“discriminant reliability,” particularly on those traits
with low reliability. That is, is the reliability for a
trait higher than the correlations between that
trait and all other supposedly distinct traits? While in
general they are, there are three traits for men and
one for women which do not meet this criterion:
prying, unappreciative, and good judge of person-
ality for men and insincere for women. And for all
of these except unappreciative, only one of the
between-trait correlations exceeds the reliability
coefficient in magnitude. The corrected self-descrip-
tion reliabilities range from .46 to .84 and .54 to .85
for females and males, respectively, with median
reliabilities of .71 for both. In general they are about
as satisfactory as the reputation reliabilities.
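
For readers unfamiliar with the correction mentioned above, the standard Spearman-Brown formula for a test doubled in length is r_full = 2r / (1 + r), where r is the correlation between the two halves. The sketch below applies it and its inverse; the function names are hypothetical, and the half-test value printed at the end is a back-calculation for illustration, not a figure reported in the article.

```python
def spearman_brown(r_half):
    """Step a split-half correlation up to full-length reliability."""
    return 2 * r_half / (1 + r_half)

def half_from_corrected(r_full):
    """Inverse of the correction: the half-test correlation implied by r_full."""
    return r_full / (2 - r_full)

print(round(spearman_brown(0.5), 3))
# A corrected reliability of .79 would correspond to a
# correlation of roughly .65 between the two halves.
print(round(half_from_corrected(0.79), 2))
```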
The intercorrelation of self-description and repu- 
tation scores might be considered validity data, 
though for our purposes it is difficult to say which 
should be the criterion for the other. The magni- 
tudes of these correlations are typical of the validity 
data for personality mgasures in general when such 
measures represent independent assessments. In the 
extensive sampling presented by Campbell and Fiske 
(1959), -correlations of .50 are unusually high and 
values of .30 typical of “successful” measurement 
efforts. Among the correlations between the double- 
standardized self-description and reputation scores, 
for both males and females, 11 out of the 28, exceed 
30. Aff but 2 of the 22 values of 30 or greater 
meet the discriminant validity criterion of being 
higher than any of their heterotrait-heteromethod 
values (Campbell & Fiske, 1959). Considering repu- 
tations and self-descriptions as two different meth- 
ods, this criterion compares correlations between 
the same trait using the two methods, to the corre- 
lations between scores on one trait and scores on 
other traits "using the other method. Using the 


‘standard two-tailed test of significance, which is a 


less stringent criterion, 20 of the 28 correlations are 
significant beyond the .01 level for males and 22 are 
significant for femalés, These eutcomes, though more 
favorable, are not substantially different from those 
obtained with the raw means. 


6 Both parts of the double-standardization pro-
cedure reduce computed reliability to a lower value
than that obtained using raw means as scores. The
first standardization reduced the range of scores by
equating the average level of unfavorability. The
second standardization further reduced the range of
scores by equating the means and variances across
all residential groups. Thus, reliabilities computed
from raw reputation scores (simple mean reputation)
average about .10 higher.
RESULTS
Trait Correlation Analyses (r) 


Within-trait correlations. The first section 
of Table 1 presents the diagonal values of 
four product-moment correlation matrices 
based on the double-standardized reputation 
and self-description scores of both male and 
female friendship pairs.7 These correlations
represent the major analysis testing the rela- 
tion between the personalities of friends. 
Where significantly positive they support a 
similarity relation. Contrariwise, where nega- 
tive, they support contrast. 

For reputation correlations, 25 of 28 and 24 
of 28 are in the similarity direction for fe- 
males and males, respectively. Ten are sig- 
nificantly positive for the females and 6 for 
the males. No negative correlations are sig- 
nificant. By a sign test, these results would 
be highly significant, although one could 
scarcely claim independence for the correla-
tions. Similarly, there are far too many signifi-
cant correlations to consider them as chance
fluctuations around an average correlation of 
0. Averaging the separate correlations of fe- 
males and males on each trait by converting 
them to z scores, 15 of 28 correlations (all in 
the similarity direction) exceed the 5% level: 
anxious, bossy, uncapable, complaining, con- 
ceited, critical, masculine-effeminate, gullible, 
hostile, uninfluential, obsequious, unorderly, 


7 The intraclass correlation would have been the 
most appropriate statistic for examining personality 
relations within single traits for pairs of friends. 
However, for correlations between different traits, 
there is some argument for using the standard 
product-moment correlation. When different traits 
are used, the entries for computing the correlation 
are not symmetrical as in the case where one is 
correlating pairs on a single dimension (i.e., the
heights of married couples). If the two trait means 
differ, the effect is to lower the intraclass correlation. 
Since cross-trait relations were initially judged to be 
as relevant as within-trait relations, this argued 
against using the intraclass correlation. While it was 
expected that double standardizing the scores would 
reduce differences in trait means, it was difficult 
to estimate in advance the magnitudes and effects 
of residual differences. Since all measurements are 
entered twice in computing the intraclass correlation, 
the number of degrees of freedom is doubled. Thus, 
the decision to use the Pearson product-moment cor- 
relation results in a more conservative estimate of 
association. 
TABLE 1
CORRELATIONS BETWEEN FRIENDS’ DOUBLE-STANDARDIZED TRAIT SCORES

                            Mixed-choice reciprocal pairs       First-choice      First-choice
                                                                reciprocal        nonreciprocal
                            Reputations       Self-descriptions Reputations       Reputations
Trait                       Females  Males    Females  Males    Females  Males    Males
                            (N=100)a (N=89)   (N=93)   (N=87)   (N=53)   (N=44)   (N=23)
                            (1)      (2)      (3)      (4)      (5)      (6)      (7)
Anxious                     28**     00       -06      09       15       03       -01
Bossy                       18       21*      13       -01      36**     -05      -08
(Un)capable                 26       09       -04      -07      34*      17       -03
Complaining                 29**     17       00       16       37**     25       -13
Conceited                   21*      10       05       -11      15       01       -28
Critical                    26**     13       ?        -10      42**     14       -29
Dependent                   12       06       05       18       ?        -23      -34
Masculine                   14       34**     -03      20       20       36*      -30
Greedy                      08       02       19       07       -11      19       04
Gullible                    21*      15       07       02       37**     07       -32
Hostile                     22*      21*      16       -04      18       25       06
(Un)influential             36**     34**     16       29**     55**     30*      -13
Insincere                   21*      03       -28**    17       17       -08      13
Obsequious                  16       22*      -06      01       09       ?        02
(Un)orderly                 18       14       23*      -20      19       27       30
Prying                      05       -02      03       04       25       04       16
Scheming                    18       -15      -11      02       26       -02      -15
Secretive                   12       -04      15       -06      16       11       23
Sex interest                07       30**     32**     03       17       45**     -11
Stingy                      00       -06      03       -16      -02      -01      37
Strict                      08       09       03       -04      34*      29*      33
Stubborn                    -07      15       00       04       08       11       -04
Suspicious                  14       16       15       -13      12       40**     07
Touchy                      17       08       -16      -04      -13      -03      -24
Unappreciative              -03      13       04       -15      04       26       40
Unscrupulous                -02      05       03       27*      17       -01      38
(Bad) judge of personality  32**     16       14       20       34*      33*      -28
General unfavorability      07       04       00       05       06       15       ?
Average r                   16       11       04       02       20       14       02

a N = number of friendship pairs.
* p < .05.
** p < .01.

sex interest, suspicious, and bad judge of per- 
sonality. 
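
The averaging step described above can be illustrated with a short sketch. It assumes the usual Fisher r-to-z transformation and a mean weighted by N - 3; the article does not say whether the mean was weighted, and the function names are hypothetical. The two input correlations are the (un)influential reputation values from Table 1.

```python
import math

def fisher_z(r):
    """Fisher r-to-z transform (inverse hyperbolic tangent)."""
    return math.atanh(r)

def average_correlation(r_a, n_a, r_b, n_b):
    """Average two independent correlations on the z scale.

    Returns (mean r, normal deviate for testing the mean against 0).
    Weighting by n - 3 is an assumption; the article does not state
    whether a weighted or unweighted mean was used.
    """
    w_a, w_b = n_a - 3, n_b - 3
    z_mean = (w_a * fisher_z(r_a) + w_b * fisher_z(r_b)) / (w_a + w_b)
    se = 1.0 / math.sqrt(w_a + w_b)  # standard error of the weighted mean z
    return math.tanh(z_mean), z_mean / se

# (Un)influential reputation correlations from Table 1:
# females r = .36 (N = 100), males r = .34 (N = 89).
r_mean, deviate = average_correlation(0.36, 100, 0.34, 89)
print(round(r_mean, 2), deviate > 1.96)
```

Under these assumptions the averaged correlation for this trait is about .35 and lies well beyond the two-tailed 5% cutoff, in line with its inclusion in the list above.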

The self-description correlations are less
consistent. For females and males, respectively,
20 of 28 and 16 of 28 are in the similarity
direction. However, for each sex only 2
are significantly positive. In addition, insincere
is significantly negative for females.
Averaging the correlations of females and
males, only 4 traits exceed the 5% level: uninfluential,
sex interest, unscrupulous, and bad
judge of personality.

These results indicate a stable similarity
relation between friends’ reputations. The
evidence for a relation between the self-descriptions
of friends is far more tenuous. For
self-description scores, the average correlation
across traits is .04 and .02 for females
and males in comparison to .16 and .11 for
reputation scores. The conservative assumption
of independence between the 28 traits
would lead to the expectation that approximately
1.5 correlations would exceed the 5%
level in the positive direction by chance. That
four self-description correlations do exceed the
5% level in the similarity direction does not
represent a substantial departure from chance.
However, to conclude no relation is overly
conservative. The 28 traits are not 28 separate
dimensions, 2 correlations are significantly
positive beyond the 1% level, and the
Pearson r gives a conservative estimate of
significance.
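
The chance argument above can be made concrete with a binomial sketch. It assumes, as the text does for the sake of argument, 28 independent traits and a per-trait probability of .05 of exceeding the criterion in the positive direction; the function name is hypothetical. With these inputs the expected count is 28 x .05 = 1.4, close to the approximately 1.5 cited.

```python
from math import comb

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n_traits = 28
alpha = 0.05  # assumed per-trait chance of a spuriously significant positive r
expected = n_traits * alpha
p_four_or_more = binomial_tail(4, n_traits, alpha)
print(f"expected by chance: {expected:.1f}")
print(f"P(4 or more of 28): {p_four_or_more:.3f}")
```

Because the traits are not in fact independent, this tail probability is only a rough guide, which is the reservation the text itself raises.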

Cross-trait correlations. Further support for 
similarity or contrast relations was evaluated 
by examining the off-diagonal correlations. 


Table 2 presents those trait pairs for which
the analogous correlations on symmetrical
sides of the diagonal were significant beyond
the 5% level. This conservative criterion
largely eliminates the burden of interpreting
sampling error. Note that the trait labels in
Table 2 have been reversed in polarity whenever
necessary so that all listings are positive
associations. Inspection suggests similarity on
a dominant-submissive, effective-ineffective
dimension for reputations. For self-descriptions,
no cross-trait correlations are consistently
significant for females and only the
intercorrelation of touchy and scrupulous is
consistently significant for males.

The cross-trait relations are judged to be
invariably in the similarity direction. In all
cases the sign of the association between
traits is the same as the sign of the correlation
between the same traits within individuals.
This is true regardless of whether individuals’
reputation scores or self-description
scores are used for comparison purposes.


Subanalyses


Profile correlation analyses (Q) of first- 
choice reciprocal pairs. Both the within-trait 


TABLE 2

CONSISTENTLY SIGNIFICANT POSITIVE CROSS-TRAIT
RELATIONS FOR FRIENDS

Females                                   Males

Reputation scores                         Reputation scores

Capable: influential                      Capable: good judge of personality
Capable: good judge of personality        Effeminate: appreciative
Capable: critical                         Hostile: unappreciative
Complaining: gullible                     Influential: sex interest
Complaining: influential
Complaining: stubborn
Critical: influential
Critical: good judge of personality
Hostile: capable
Influential: stubborn
Influential: good judge of personality
Stubborn: good judge of personality

Self-description scores                   Self-description scores

None                                      Touchy: scrupulous

Note. The polarity of trait labels has been changed whenever
necessary so that all relations are positive.
TABLE 3

MEAN PROFILE CORRELATIONS BETWEEN DOUBLE-STANDARDIZED
REPUTATION AND SELF-DESCRIPTION SCORES OF FIRST-CHOICE
RECIPROCAL FRIENDS AND RANDOM PAIRS

Variables                Friend    Random
intercorrelated          pairs     pairs     t

Female pairs (N = 53)
  RA and RB              .27       .00       3.49**
  SA and SB              -.03      .02       1.17a
  RA and SA              .23       .23
  RB and SB              .23       .23
  RA and SB              .03       .01
  RB and SA              .04       .05
Male pairs (N = 44)
  RA and RB              ?         -.03      2.19*
  SA and SB              .02       .04
  RA and SA              .32       .32
  RB and SB              .30       .29
  RA and SB              .01       -.05      1.19a
  RB and SA              ?         .00       1.70a

Note. R = reputation scores, S = self-description scores,
A and B = members of pairs.
a p > .05.
* p < .05.
** p < .005.


and the cross-trait analyses presented above
show consistent evidence of reputational similarity
between friends. The slightly stronger
evidence for females may only reflect the
higher reliability of their trait scores. Though
the evidence for self-descriptions remains
much less compelling, it might be argued that
the fault lies in the procedure used to construct
friendship pairs. Maximizing the number
of pairs necessarily included many pairs
with weak friendship ties. Personality relations
may be more likely between persons
who have strong, enduring, interpersonal relationships.

To examine this possibility, double-stand- 
ardized trait scores from all reciprocal first- 
choice friendship pairs were further analyzed. 
Pair members were randomly assigned to one 
of two groups. A profile (Q) correlation was 
computed separately for each of the 53 female 
pairs and 44 male pairs, and then averaged 
across the two sets of pairs. This was done for 
all combinations of reputation scores, self- 
description scores, and “A” and “B” members 
of pairs. For comparison, random same-sexed 
pairs were also constructed and profile correlations
likewise computed and averaged
across pairs for all combinations of scores
and groups.8
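
The profile (Q) procedure just described can be sketched as follows. The data are synthetic and the function names hypothetical; each person is represented by a list of 28 trait scores, the correlation is computed across traits within a pair, and the pair correlations are then averaged. The random re-pairing shown is only a stand-in, since the article does not describe how its random same-sexed pairs were drawn.

```python
import random
from statistics import mean

def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def mean_profile_correlation(pairs):
    """Average Q correlation over (profile_a, profile_b) pairs."""
    return mean(pearson(a, b) for a, b in pairs)

def random_pairs(profiles, rng):
    """Shuffle persons and pair neighbours (assumed random-pairing scheme)."""
    shuffled = profiles[:]
    rng.shuffle(shuffled)
    return list(zip(shuffled[0::2], shuffled[1::2]))

# Synthetic illustration only: 10 "friend" pairs whose 28-trait
# profiles share a common component plus independent noise.
rng = random.Random(0)
friends = []
for _ in range(10):
    shared = [rng.gauss(0, 1) for _ in range(28)]
    a = [s + rng.gauss(0, 1) for s in shared]
    b = [s + rng.gauss(0, 1) for s in shared]
    friends.append((a, b))

everyone = [p for pair in friends for p in pair]
print(mean_profile_correlation(friends))
print(mean_profile_correlation(random_pairs(everyone, rng)))
```

A simple shuffle can occasionally re-pair true friends; a fuller implementation would exclude such pairs before averaging.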

The average correlations are presented in
Table 3. The consistent positive profile correlation
between individuals’ reputations and
self-descriptions, .23 for females and .31 for
males, is similar to the previously discussed
average of the validity diagonal correlations
between reputation and self-description scores.

As expected, the reputations of randomly
paired persons were unrelated. Comparison of
the reputations of first-choice reciprocal
friends to randomly paired persons gave the
expected outcome of greater similarity for
friends. This was true for both females (t =
3.49, p < .005) and males (t = 2.18, p <
.05). In contrast, for both females and males
the average self-description correlations of
friendship pairs, as well as of randomly paired
persons, hover around 0. These results are
essentially in agreement with the previously
reported outcome for the larger sample. Reputational
similarity is found for friends, and
is somewhat stronger among females. Self-descriptions
were unrelated.

A similar set of analyses was performed on
the raw reputation and self-description scores.
The only change in outcome was an increase
in the reputation and self-description correlations
of friends by about .20. The correlations
between random pairs as well as the
correlation between reputation and self-description
remained essentially unchanged.
This outcome argues for the usefulness of the
double-standardization procedure. However, it
is to be noted that on the trait general unfavorability,
derived from the overall favorability
of a subject’s reputation or self-description,
friends’ scores were unrelated (see Table
1). This was true for both the entire sample
and the subsample of first-choice reciprocals.
This suggests that it is the omission of the
second standardization which is most responsible
for the higher profile correlations obtained
with raw scores. This difference may
also be due to the lower reliabilities of the
double-standardized scores (see Footnote 6).


8 Again, the double-standardization procedure
would ideally have been redone on the reputation
and self-description scores of this subsample. However,
the expense precluded this.


Trait correlation analyses (r) of first-choice
nonreciprocal pairs. The source of reputational
similarity in the analysis of the double-standardized
first-choice reciprocal pairs remains
ambiguous. The outcome for reputations
may reflect true similarity which, due to
contrast judgment error, fails to appear in self-descriptions.
On the other hand, the self-descriptions
may reflect the truth, and similarity
found in reputations may reflect similarity
judgment error. The choice between
these alternatives was evaluated by selecting
all pairs among males where a first choice was
totally unreciprocated by the chosen person.
It is assumed that nonreciprocal pairs possess
two important properties that will permit interpretation
of the source of friends’ reputational
similarity. First, it is assumed that the
chooser does indeed see personality traits in
his choice that attract him. Second, it is assumed
that since the target of friendship does
not list the chooser among his own five
friends, no real friendship exists between
them. Thus, any relation between traits cannot
be attributed to judgment errors in the
direction of assuming similarity between people
who are friends.

Only 23 such pairs could be found among
males.9 Product-moment correlations between
the 28 double-standardized reputation scores
of chooser and chosen are presented in the
last column of Table 1. While the range is
larger than that found for the entire sample
of friends, none reaches significance and the
average of .018 is remarkably close to 0. The
six traits for which significant reputational
similarity was found among the entire sample
of reciprocal friends provide no confirmation
whatsoever: bossy, -.08; masculine, -.30;
hostile, +.06; uninfluential, -.13; obsequious,
+.02; sex interest, -.11.

Trait correlation analyses (r) of first-choice
reciprocal pairs. These negative results for 
nonreciprocal first-choice friendship pairs 
contrast interestingly with the within-trait 
correlations for reciprocal first-choice pairs 
(see Columns 5 and 6 of Table 1). These lat- 
ter correlations are based on the double- 


9 Since there was no relation between male nonreciprocal
pairs, and since there were substantially
fewer pairs of nonreciprocal females, correlations
for females were not computed.
standardized reputation scores from the 44
male pairs and the 53 female pairs for whom
mean profile correlations were presented in
Table 3. Six correlations are significant beyond
the 5% level for males, eight for females.
As expected, the average correlations
of .14 and .20 for males and females are
higher but not too dissimilar from the average
values of .11 and .16 obtained from the entire
sample (see Table 1).


DISCUSSION 
In summary, the findings are as follows: 


1. For a large sample of reciprocal friendship
pairs, in which many of the friendship
ties may have been weak, there is reputational
similarity on a variety of traits, most
of which fall on a dimension of effectiveness.
Cross-trait correlations were also interpreted
as indicating reputational similarity.

2. In spite of a few significant correlations, 
self-descriptions for this same set of friend- 
ship pairs were essentially unrelated. 

3. Profile correlations for a subsample of
closer friends (reciprocal first-choice) confirmed
these findings. For both females and
males, pairs of close friends had a similar
profile of reputations whereas random pairing
of the same individuals yielded no relation.
The magnitude of reputational correlations
for individual traits was greater for
close friends. The self-descriptions of these
close friends were unrelated. If raw reputation
and self-description scores are used, the correlations
are substantially higher.

4. The trait general unfavorability was not
one of those found to be significantly similar
between friends. Even among first-choice
reciprocals the general unfavorability of
friends remains unrelated. Had this fact been
known prior to the major analyses it could
have been used as an argument against the
first standardization of the double-standardization
procedure. However, the authors lacked
such prescience and the a priori logic of such
standardization was sound.

Together, these results suggest that friends
have similar reputations. The negative findings
for nonreciprocal first-choice pairs suggest
that “true” similarity in personality does
not affect choice of the persons with whom
one develops friendships. The obtained reputational
similarity could stem from learning
during the interaction entailed by friendship.
To assume, however, that friends do indeed
become similar in personality raises the question
of why similarity fails to appear in self-descriptions.
That self-descriptions are inaccurate
is not a solution. If friends are acquiring
one another’s traits through interaction,
they should acquire the same defenses and
error tendencies as well. This would lead to
discrepancy between self-descriptions and
reputations for individuals, but nevertheless,
similarity between self-descriptions of friends.

Another alternative is to interpret the reputational
similarity as judgment error. Although
the lack of correlation for general
unfavorability argues against overall exaggeration
of similarity between friends on the
like-dislike dimension, the possibility of some
type of judgment error on the part of raters
appears to provide the most parsimonious interpretation
of the results. Those who are
friends have been shown to be indeed similar
on numerous other dimensions such as attitudes,
socioeconomic class, religion, values,
interests, etc. (Burgess & Wallin, 1953;
Byrne & Blaylock, 1963; Lindzey & Borgatta,
1954; Richardson, 1939). With so many dimensions
on which true similarity exists, generalization
of similarity to personality trait
dimensions could readily occur. The peers
who provided the reputations may have mistakenly
rated those who navigate in space
and time together and who tend to be similar
on a variety of attitudinal, socioeconomic,
interest, and skill dimensions, as also similar
on a variety of personality dimensions.

A few final points of caution are in order. 


1. It should be noted that friends have 
been found to be similar on objective meas- 
ures of intellect (Morton, 1960; Richardson, 
1939). Some of the traits for which signifi- 
cant similarity was found appear to be loaded 
on such a factor (e.g., influential, good judge 
of personality). In fact, one of the four in- 
stances of a similarity correlation among self- 
descriptions was for the trait influential. Thus, 
some significant relations may indeed reflect 
true similarity on this dimension. 
2. This study deals solely with same-sexed
friends. It is quite possible that the intensity
and stability of cross-sexed friendships (e.g.,
married couples) are rarely matched in the
same-sexed pairs used in the present study.

3. Reciprocal reinforcement may occur more 
frequently in relation to other trait dimen- 
sions than those sampled here. Thus, these 
results do not preclude the possibility of dis- 
covering personality relations on other dimen- 
sions. 

4. In different friendship pairs reciprocal reinforcement
patterns may develop for different
pairs of traits. Thus, relations between
the personalities of friends may remain consistent
over time and indeed depend on trait-related
reinforcement mechanisms, yet different
combinations of traits may be idiosyncratically
salient to the maintenance of each
friendship.
In a partial test of the role-playing theory of hypnosis, this study examined
the relationship of hypnotic susceptibility, as measured by the Children’s Hypnotic
Susceptibility Scale (CHSS), to 2 measures of role-playing ability in
children. 21 girls and 21 boys between 7 and 12 yr. old, all volunteered by
their parents, were tested in 2 sessions. In the 1st, tests of verbal intelligence
and the CHSS were administered; in the 2nd, children performed on 2
standardized tests of dramatic acting and hypnosis simulation ability. Results
indicate a positive relationship between hypnotic susceptibility and the
ability to simulate hypnosis, but dramatic acting ability was generally unrelated
to susceptibility. Intelligence was significantly related to the susceptibility
measures only, but not to the role-playing tests, while age and sex
were not significantly related to any of the experimental measures.


The terms “role playing” and “role taking”
have been used to name or explain activities
ranging from the process of identification to
the behavior of hypnotic subjects (Bandura
& Walters, 1959; Cameron & Magaret, 1951;
Jersild, 1950; Murphy, 1947; Piaget, 1951;
Sarbin, 1950; Sears, Maccoby, & Levin,
1957).

Sarbin (1950) and White (1941) describe 
hypnotic behavior as a result of the subject's 
assumption of the role of a hypnotized person. 
They thereby imply an explanation of susceptibility
in terms of role-playing abilities. Orne
(1959) demonstrated that prior knowledge
about hypnosis influenced adult hypnotic behavior
and that positive responses to hypnotic
suggestions failed to differentiate persons
faking hypnotic susceptibility from bona
fide susceptibles. Corresponding observations
were made by London (1962) of child
susceptibility.

Recent experiments have demonstrated the
complex nature of susceptibility and role playing
in children (Bowers & London, 1965;
Stukat, 1958; Tromater, 1961; Weitzenhoffer,
1953). Interpretations of susceptibility
which implicitly support the role-playing view
are presented by Hilgard, Weitzenhoffer,
Landes, and Moore (1961), and by Faw and
Wilcox (1958).


1 This investigation was supported by Research
Grant MH 08598, from the National Institute of 
Mental Health, United States Public Health Service, 
Perry London, principal investigator. 
It has long been known that children are
more suggestible than adults, with peak suggestibility
occurring somewhere between 7 and
14 years (Boring, 1950; Coffin, 1941; Hilgard
et al., 1961; Hull, 1933; Hurlock, 1930), but
early studies generally suffered two major
flaws: failure to define suggestibility or susceptibility
carefully and/or lack of precise
measures which suited their definitions (London,
1962).

Current quantitative studies of hypnotic 
susceptibility (London, 1962, 1963; Moore & 
Lauer, 1963; Tromater, 1961) indicate that 
children's scores tend to average almost twice 
as high as adult scores based on similar items. 
The most comprehensive recent work found 
that susceptibility increases rapidly from age 
5 until about age 8 with a slight further rise 
to age 12, then levels off until 16, at which 
time a slight drop occurs (London, 1965). 

Role theory and empirical observation alike 
suggest that true hypnotic susceptibility may 
be connected with the ability to simulate hyp- 
nosis, but even if this were true, it would 
not demonstrate that role-playing ability in 
general accounted for susceptibility. One form 
of role playing might not be like another. 
London and Bowers (1964) devised two reli- 
able standard role-playing instruments, one 
measuring dramatic acting ability in general 
and the other hypnotic simulation ability in 
particular,2 and examined the relative abilities

2 The test and its scoring manual have been deposited
with the American Documentation Institute,
of 40 children, aged 5 through 11, to perform
on them (Bowers & London, 1965). Not sur- 
prisingly, they found that age and intelligence 
accounted for most of the variance in per- 
formance, but when these variables were held 
statistically constant, the correlation between 
measures was nil. The two behaviors, both 
forms of role playing, are somewhat inde- 
pendent of each other. The developmental 
character of role-playing ability, as demon- 
strated in their study, does imply some rela- 
tionship between hypnotic susceptibility and 
the ability to simulate it, but no direct comparisons
of susceptibility and role playing
were made. The purpose of the present study
was to compare dramatic acting and hypnotic
simulation directly with hypnotic susceptibility.
On the basis of Bowers and London’s results,
it was hypothesized that:

1. Hypnotic susceptibility is positively re- 
lated both to the ability to perform a dramatic 
acting role and to simulate hypnosis. 

2. Hypnotic susceptibility is more strongly 
related to dramatic acting than to hypnotic 
simulation. 

3. The degree of relationship between sus- 
ceptibility, acting, and simulation is unrelated 
to age or intelligence. 


METHOD 
Subjects 


Subjects were 21 girls and 21 boys ranging in age 
from 7.8 to 11.9, the peak years for hypnotic sus- 
ceptibility. Subjects were obtained from parents who 
responded favorably to a request for volunteers 
mailed to parents of public school children in 
Urbana, Illinois. The sample corresponds socioeco- 
nomically and educationally to current census data 
for Urbana-Champaign (United States Bureau of the 
Census, 1960) but is higher than the general United 
States population. 

The sample was not strictly random as only volun- 
teers were used. In most cases, however, the children 
themselves were not volunteers; their parents volun- 
teered them. The precise sampling error which this 
situation creates is unknown. 


Order Document No. 8260 from ADI Auxiliary Pub- 
lications Project, Photoduplication Service, Library 
of Congress, Washington, D. C. 20540. Remit in 
advance $1.75 for microfilm or $2.50 for photocopies 
and make checks payable to: Chief, Photoduplication 
Service, Library of Congress. Copies of the test may 
also be obtained on request from Perry London, 
Department of Psychology, University of Southern 
California, Los Angeles 90007. 


Instruments 


The Children’s Hypnotic Susceptibility Scale
(CHSS) was used as the hypnotic criterion measure
(London, 1963). The CHSS was adapted from the
Stanford Hypnotic Susceptibility Scale (Weitzenhoffer
& Hilgard, 1959). The CHSS is a 22-item scale:
Part I is almost identical in content to Form A
of the Stanford scale, while Part II contains 10
items, largely selected from other Stanford scales.
Interobserver reliabilities range from .90 to .96
(London, 1962). Retest reliability for a second test
administered by a different experimenter is .92.

Part I of the CHSS requires subjects to give
mostly motor responses, while the items which comprise
Part II are scored from the subjects’ reports
of their subjective experiences. Each child was scored
for overt behavior, subjective involvement, and total
susceptibility. Overt behavior (range: 0-66) was
scored on a 4-point scale (0-3). Each item receiving
a passing score on overt behavior (2 or 3) was also
assigned a rating of A, B, or C, later converted to
1, 2, or 3, to indicate the apparent subjective reality
(range: 22-66) to the child of his hypnotic experience.
These scores reflect the experimenter’s clinical
impression that the child was “faking,” “partially
involved” (or indeterminate involvement), or “deeply
involved” in the items. Total scores (range: 0-198)
were computed by first multiplying overt behavior
by involvement, for each item, then summing the
products over all items (London, 1965).
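
The scoring rule above can be written out as a short sketch. One point is an assumption rather than something the text states: items that do not receive a passing overt score are given an involvement value of 1, which is the only reading under which the stated involvement range of 22-66 holds for 22 items. The function name is hypothetical.

```python
N_ITEMS = 22
INVOLVEMENT = {"A": 1, "B": 2, "C": 3}  # faking, partially involved, deeply involved

def chss_scores(overt, ratings):
    """Return (overt behavior, involvement, total) for one child.

    overt   -- 22 item scores, each 0-3
    ratings -- 22 entries, "A"/"B"/"C" for passed items, None otherwise
    """
    if len(overt) != N_ITEMS or len(ratings) != N_ITEMS:
        raise ValueError("expected 22 items")
    involvement = []
    for score, rating in zip(overt, ratings):
        passed = score >= 2
        # Assumption: items that are not passed count as 1, the minimum.
        involvement.append(INVOLVEMENT[rating] if passed and rating else 1)
    total = sum(o * i for o, i in zip(overt, involvement))
    return sum(overt), sum(involvement), total

# Extremes of the scale reproduce the ranges given in the text.
print(chss_scores([3] * 22, ["C"] * 22))
print(chss_scores([0] * 22, [None] * 22))
```

Because each item contributes the product of its two ratings, a child who passes every item but is judged to be faking throughout would total 66 rather than 198.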

Dramatic Acting Test. The experimenter describes 
a series of situations to the child, assigns himself 
and the child specific characters (roles), and provides 
standardized lines, while the subject invents his 
own responses. This test is in effect a type of commedia
dell’arte theatre in which the subject assumes
a series of familiar roles: mother (the experimenter
plays a child who has broken a lamp), father (the 
experimenter is a child whose teacher had complained 
to the father regarding school behavior and low 
grades), friend (the experimenter plays a peer who 
has lost $10), bully (the experimenter is a younger 
child who desires to play ball with the bully’s team), 
teacher (the experimenter is a whining “tattletale” 
who complains about the behavior of a classmate), 
and sheriff (the experimenter plays the robber). The 
experimenter’s lines are standard, while the sub- 
ject must invent appropriate responses (London & 
Bowers, 1964). 

A high score on the Dramatic Acting Test reflects
the adoption of an attitude consistent with the
cultural stereotype of the role, indicated by a plausible
sequence of lines which incorporate this attitude
with the specific situation of the play, especially the
preceding stimulus line of the experimenter. The
content of the subject’s lines was rated on a 4-point
scale: no role adoption, lack of plausible sequence
or adequate role adoption, moderately plausible
sequence, and satisfactory role adoption. Each line
is scored. The average line scores for each role are
summed for the final tabulation. The reliabilities of
the scoring procedure were assessed by Bowers and
London (1965). Interscorer product-moment correlations
ranged from .80 to .87. Scorer reliabilities
for the present study ranged from .84 to .93.
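
The tabulation described above (rate each line, average within each role, sum over the six roles) can be sketched as follows. The numeric coding of the 4-point scale as 0-3 is an assumption, since the text names the scale points but not their values, and the ratings shown are hypothetical.

```python
from statistics import mean

ROLES = ("mother", "father", "friend", "bully", "teacher", "sheriff")

def dramatic_acting_score(line_ratings):
    """Sum over roles of the mean rating of the subject's lines in that role.

    line_ratings maps each role name to a non-empty list of per-line ratings.
    Assumption: the 4 scale points are coded 0-3; the article names the
    points but not their numeric values.
    """
    return sum(mean(line_ratings[role]) for role in ROLES)

# Hypothetical ratings for one child, for illustration only.
example = {
    "mother": [3, 2, 3],
    "father": [2, 2],
    "friend": [3, 3, 3, 3],
    "bully": [1, 2],
    "teacher": [2, 3],
    "sheriff": [3],
}
print(dramatic_acting_score(example))
```

Averaging within a role before summing keeps a child who speaks many lines in one role from outweighing a child who speaks few.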

Hypnotic Simulation Test. Simulation of hypnosis 
was measured using nine items from Part II of the 
CHSS which were scored for overt behavior only. 
Hypnoidal instructions were removed and a special 
positively motivating initial instruction was added 
(Bowers & London, 1965). 

Wechsler Intelligence Scale for Children (WISC),
Vocabulary subtest. The Vocabulary subtest was
administered to all subjects. It correlates between
.79 and .89 with ia verbal age (8-11 year age
range, Wechsler, 1949). A verbal IQ was extrapolated
for analysis.


Procedure


Subjects were tested individually in two 90-minute
sessions 7 days apart. Parents were encouraged to
watch the proceedings through a one-way vision
screen. During the first experimental session, the
WISC verbal subtest and the complete CHSS were
administered by one experimenter while another observed.
During the second session, two experimenters
who had no previous contact with the subject
administered the Dramatic Acting Test and the
Hypnotic Simulation Test while raters observed
through one-way vision screens.

The Dramatic Acting Test was always administered
first, after which the experimenter introduced
the Hypnotic Simulation Test as “another acting
job.” The subject was instructed to act like a hypnotized
person in order to fool another person who
was going to come into the room. The subject was
told that the new experimenter would not know
whether or not he was really hypnotized and might
be suspicious because he knew that sometimes the
children were hypnotized and sometimes only pretending.
The subject was further warned that if,
during the session, the experimenter became suspicious,
he would immediately stop the proceedings,
but as long as the experimenter continued making
suggestions, the subject could be assured that his
ruse was succeeding.


RESULTS


Hypnotic susceptibility was separately 
measured by each of the three highly corre- 
lated CHSS scores. Susceptibility was hy- 
pothesized to relate positively both to the 
ability to perform dramatic acting roles and 
to simulate hypnosis. This hypothesis was 
partially confirmed. All correlations between 
the Hypnotic Simulation Test and the three 
measures of susceptibility differed signifi- 
cantly from zero (p < .005, Table 1), but 
the predicted positive relationship between
dramatic acting and hypnotic susceptibility 
was not substantiated. Though subjective in- 
volvement and total score were positively cor- 
TABLE 1

PEARSON PRODUCT-MOMENT CORRELATIONS BETWEEN
HYPNOTIC SIMULATION, DRAMATIC ACTING, AND
HYPNOTIC SUSCEPTIBILITY (N = 42)

Susceptibility       Dramatic acting    Simulation
Overt behavior       -.02               .64*
Involvement          .21                .48*
Total                .13                .60*

* p < .005, one-tailed test.


related with dramatic acting, the relationships 
were very small and undependable. 

The second hypothesis predicted that the 
Dramatic Acting Test score would be more 
strongly related to hypnotic susceptibility 
than would Hypnotic Simulation Test scores. 
This hypothesis was not supported. The very
opposite might, in fact, be the true relation- 
ship between the variables, since simulation 
scores correlate significantly with the suscepti- 
bility measures, but dramatic acting scores 
do not (Table 1). Further testing of this hy- 
pothesis was made by examining the signifi- 
cance of the differences between the two sets 
of correlations. A test for nonindependent cor- 
relations indicates that the correlation of act- 
ing with total score is significantly lower than 
that of simulation with total (p < .01). The 
difference between the correlation of involve- 
ment with dramatic acting and with simula- 
tion is not significant. Two of the three sus- 
ceptibility measures are thus differentially 
correlated with dramatic acting and simula- 
tion, and in the opposite direction from that 
predicted by the hypothesis (Table 2). 
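
The comparison of two correlations that share a variable can be sketched as follows. The article does not name the statistic it used, so this example assumes Hotelling's t for nonindependent correlations; the function name `hotelling_t` is illustrative, and the inputs are the rounded total-score correlations reported in Tables 1 and 3. Because of that assumption and the rounding, the result need not reproduce the t values printed in Table 2.

```python
import math


def hotelling_t(r_xz: float, r_yz: float, r_xy: float, n: int) -> float:
    """t statistic (df = n - 3) for H0: rho_xz == rho_yz.

    x and y are two measures that are each correlated with the same
    third measure z in a single sample of n subjects.
    """
    # Determinant of the 3 x 3 correlation matrix of x, y, and z.
    det = 1 - r_xz**2 - r_yz**2 - r_xy**2 + 2 * r_xz * r_yz * r_xy
    return (r_xz - r_yz) * math.sqrt((n - 3) * (1 + r_xy) / (2 * det))


# Simulation (x) and acting (y), each correlated with total CHSS score (z).
# Rounded values taken from the tables: .60, .13, and -.05, with N = 42.
t_total = hotelling_t(r_xz=0.60, r_yz=0.13, r_xy=-0.05, n=42)
print(f"t({42 - 3}) = {t_total:.2f}")
```

The statistic grows with the difference between the two correlations and shrinks as the three measures become more redundant with one another, which is why the acting-simulation correlation enters the formula.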

The effects of age and intelligence on the 
relationships between susceptibility, acting, 
and simulation were thought to be inconse- 


TABLE 2 


ANALYSIS OF DIFFERENCES BETWEEN CORRELATIONS 
OF SUSCEPTIBILITY MEASURES WITH 
SIMULATION AND ACTING (N = 42) 


Variable                            r     Difference     t 
Acting with total score            .13 
Simulation with total score        .60       .46       2.08* 
Acting with overt behavior        −.02 
Simulation with overt behavior     .64       .66       2.64** 
Acting with involvement            .21 
Simulation with involvement        .48       .27        .76 

* p < .05. 
** p < .01. 
TABLE 3 


PRODUCT-MOMENT INTERCORRELATION MATRIX, MEANS, AND STANDARD DEVIATIONS 
OF EXPERIMENTAL VARIABLES (N = 42) 


                  Involvement   Total    Simulation   Acting    IQ            Age           Sex 
Overt behavior       .63**      .83**      .64**       −.02     .24           −.29          .18 
Involvement                     .94**      .48**        .21     [illegible]    .08          .16 
Total                                      .60**        .13     [illegible]   −.02          [illegible] 
Simulation                                             −.05     [illegible]   [illegible]   [illegible] 
Acting                                                          [illegible]   [illegible]   [illegible] 
IQ                                                                            −.07          [illegible] 
Age                                                                                         .06 

       Overt behavior   Involvement   Total   Simulation   Acting    IQ     Age 
M          45.7            42.7       103.7      21.2       17.6    115.9   10.3 
SD         15.8            13.7        51.5       4.3        2.6     17.4   [illegible] 


* p < .05, two-tailed test. 
** p < .005, two-tailed test. 


quential. Our results clearly support this hy- 
pothesis concerning age but are more equivo- 
cal in relation to intelligence. Age was not 
significantly correlated with any of the ex- 
perimental variables (Table 3), either singly 
or in combination with other variables. When 
the effects of age variance are removed by 
partial correlational techniques, moreover, the 
interrelationships of susceptibility and role- 
playing measures are effectively unaltered. 
Extrapolated verbal IQ, on the other hand, is 
significantly correlated with two susceptibility 
measures, subjective involvement and total, 
though apparently unrelated to overt behavior 
or to either role-playing test (Table 3). With 
the effects of verbal intelligence held statis- 
tically constant, partial correlations still indi- 
cate that the interrelationships of experimental 
variables are independent of it. Finally, there 
were no significant differences in the degree 
of correlation between IQ and any of the ex- 
perimental variables. While intelligence con- 
tributes positively to performance on some 
measures, it does not do so in very great de- 
gree to any. 


TABLE 4 


PARTIAL CORRELATIONS OF EXPERIMENTAL VARIABLES WITH EFFECTS OF AGE AND IQ 
HELD STATISTICALLY CONSTANT (N = 42) 


An additional test was made by computing 
the partial intercorrelations of experimental 
variables with age and IQ both held statisti- 
cally constant (Table 4). Their relationships 
remained virtually the same, tending again to 
confirm the third hypothesis of this study, that 
age and intelligence do not affect the degree of 
the relationships between simulation, dramatic 
acting, and hypnotic susceptibility. 
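
The partialling described above can be illustrated with the standard formulas for first- and second-order partial correlations. This is a sketch under stated assumptions: the article does not give its computing procedure, the function names are hypothetical, and the worked input uses rounded values from Table 3 (overt behavior with involvement, .63; age with overt behavior, −.29; age with involvement, .08).

```python
import math


def first_order_partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y with z held statistically constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def second_order_partial_r(r: dict, x: str, y: str, z: str, w: str) -> float:
    """Correlation of x and y with both z and w held constant.

    r maps frozenset({a, b}) to the zero-order correlation of a and b.
    """
    def zero(a: str, b: str) -> float:
        return r[frozenset((a, b))]

    # Remove z from each of the three pairwise correlations, then remove w.
    r_xy_z = first_order_partial_r(zero(x, y), zero(x, z), zero(y, z))
    r_xw_z = first_order_partial_r(zero(x, w), zero(x, z), zero(w, z))
    r_yw_z = first_order_partial_r(zero(y, w), zero(y, z), zero(w, z))
    return first_order_partial_r(r_xy_z, r_xw_z, r_yw_z)


# Overt behavior with involvement, age partialled out (rounded table values).
print(round(first_order_partial_r(0.63, -0.29, 0.08), 2))
```

When the control variable is only weakly related to both measures, as age is here, the partial correlation stays close to the zero-order value, which is the pattern the text reports.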


Comparing Sample Differences 


Since the analysis of our data confirms 
Bowers and London’s observation that acting 
and simulation are apparently unrelated to 
each other, their data were reanalyzed for 
these two variables for 8-11 year olds only. 
Despite the fact that the subjects in their 
study participated in the acting and simula- 
tion tests with no known prior exposure to 
hypnosis, a t test failed to detect significant 
differences in simulation scores for the two 
samples. 

For our subjects, all of whom did partici- 
pate in hypnosis prior to administration of the 
Hypnotic Simulation Test there was a signifi- 


                  Involvement   Total   Simulation   Acting         Sex 
Overt behavior       .67*       .86*      .63*        −.04          −.27 
Involvement                     .93*      .43*         .06          −.02 
Total                                     .56*         .00          [illegible] 
Simulation                                            [illegible]   −.20 
Acting                                                              −.09 


* p < .005, two-tailed test. 


cant performance gain from overt behavior 
susceptibility scores to subsequent simulation 
scores for the same nine items (t test for cor- 
related means, p < .001). When subjects 
were then divided into high-, middle-, and 
low-susceptible groups on the basis of total 
CHSS scores, and mean differences from hyp- 
notic to simulation session were compared 
across groups, the increase from CHSS to 
simulation scores was accounted for almost 
entirely by the low-susceptible group. All 
three groups retained their mean relative posi- 
tions from one session to the other; but the 
mean difference score for the low susceptibles 
between CHSS and simulation (for nine 
items) was a 5.7 increase from hypnotic to 
simulation session, for the mediums, 1.4, and 
for the highs, .1; lows gained significantly 
more than mediums and highs. The gamma 
correlation coefficient, a variant of rank-order 
correlation (tau), between simulation score 
ranks and the parallel CHSS score ranks was 
.45 (Hayes, 1963, pp. 655-666). 
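
For readers unfamiliar with the gamma coefficient, the following is a minimal sketch of how it is computed from two sets of scores: it counts concordant and discordant pairs and ignores tied pairs. The function name and the sample data are illustrative assumptions, not the study's data.

```python
from itertools import combinations


def goodman_kruskal_gamma(x, y):
    """Gamma = (C - D) / (C + D) over all pairs of subjects.

    C counts pairs ordered the same way on both measures, D counts pairs
    ordered in opposite ways; pairs tied on either measure are ignored.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical ranks for four subjects on two sessions.
hypnosis_ranks = [1, 2, 3, 4]
simulation_ranks = [1, 3, 2, 4]
print(round(goodman_kruskal_gamma(hypnosis_ranks, simulation_ranks), 2))
```

Because tied pairs are dropped from both numerator and denominator, gamma is usually somewhat larger in absolute value than tau computed on the same data when ties are present.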

The comparison between same age samples 
on the Dramatic Acting Test revealed a sig- 
nificantly higher score for the Bowers and 
London subjects (p < .001; t test for inde- 
pendent variance estimates was employed as 
the F test indicated a rejection of the 
hypothesis). 


Discussion 


The most striking conclusion available from 
the data of this study is that hypnotic sus- 
ceptibility in children, regardless of how it is 
measured, is virtually unrelated to dramatic 
acting ability and is quite definitely related 
to, but not identical with, the ability to simu- 
late hypnosis. That the Dramatic Acting and 
Hypnotic Simulation tests are not only reli- 
able, but also draw upon different kinds of 
role-playing abilities, seems quite apparent 
both from the present results and those ob- 
tained previously on a different sample. Within 
the age range where susceptibility is highest, 
moreover, its relationship or lack of it to these 
forms of role playing is unaffected by the age, 
sex, or intelligence of the child. 

It is of some interest that the ability to 
simulate hypnosis bears about the same de- 
gree of relationship to the different aspects 
of susceptibility as these measures bear to 
each other. The somewhat higher correlation 
of simulation with overt behavior (.64) than 
with subjective involvement (.48) may merely 
reflect the fact that the Hypnotic Simulation 
Test was scored only for overt behavior in the 
present study. That hypnotic simulation is 
by no means identical with either aspect of 
susceptibility is indicated not only by these 
correlations, but also by its relatively smaller 
correlation with total susceptibility score (.60) 
than is manifest for either overt behavior 
(.83) or subjective involvement (.94). The 
correlation of overt behavior on Part I of 
CHSS with simulation (.61), moreover, fur- 
ther precludes any interpretation of the rela- 
tionship between susceptibility and simulation 
as a function merely of the repetition of the 
items of Part II in the simulation condition. 

Considering the content and scoring proce- 
dures for the role-playing tests, it is easy to 
understand why hypnotic simulation is more 
related to hypnotic susceptibility than is dra- 
matic acting, especially in view of Sarbin's 
(1950) very definition of hypnosis as "acting 
the role of the hypnotized person." It is not 
so obvious, on the other hand, why dramatic 
acting should appear totally unrelated to sus- 
ceptibility unless the role-playing component 
of hypnotic performance is so small or spe- 
cialized as to be unrecognizable in relation to 
other, less tautological meanings of role play- 
ing. Some indirect evidence to this effect may 
be deduced from the fact that subjective in- 
volvement is the single hypnotic susceptibility 
score which comes close to being a direct 
measure of role-playing ability; it represents 
the ability to meet the stereotypes of the ob- 
server's understanding of what “sincere” hyp- 
notic responses should look like, over and 
above mere acquiescence to hypnotic sugges- 
tions. It is perhaps meaningful, therefore, that 
dramatic acting is more highly correlated 
with subjective involvement than with other 
susceptibility scores, though not significantly 
so, and that the difference between the corre- 
lations of simulation and of dramatic acting 
with involvement (.27) was insignificant, 
while the differences between the correlations 
of these measures with overt behavior (.66) 
and with total score (.47) were very much 
so (p < .01). Whatever aspect of role play- 
ing is involved in subjective involvement may 
relate to dramatic acting, hypnotic simula- 
tion, and overt behavior in a hypnotic situ- 
ation, but none of these forms of role be- 
havior is sufficient to account very well for 
the ability to respond to hypnosis. 

Even more important, perhaps, is the fact 
that performance on neither role-playing 
measure will account even for ability to re- 
spond to the other. The chief difficulty in re- 
lating susceptibility to role playing may thus 
be a function of the complex nature of the 
latter. In our study, dramatic acting is appar- 
ently no more related to hypnotic simulation 
than it is to any of the CHSS measures. This 
result, which replicates the unexpected finding 
of Bowers and London (1965), does not, of 
course, negate the fact that both tests measure 
role-playing abilities. It suggests rather, as 
they conclude, 


that role playing is not a single skill but is a 
construct describing a complex of response styles 
determined largely by the stimulus context which 
elicits them. In the case of the role playing measures 
used here, differences in instructions, types of roles 
to be played, and suggested motivations all contain 
possibilities for crucially influencing the internal set 
of Ss so that the quality of role playing which 
results from the two sets might be totally unrelated. 

The Dramatic Acting situation, by virtue of the 
necessary spontaneity of lines demanded, the famil- 
iarity of each of the roles to the child, and the 
fact that the acting is shared between E and S, 
exerts a strong pull on the child to enter the role 
honestly, completely, and without a self-critical set. 
The role and the self need not be kept distinct from 
each other. He is not fooling anyone, and can freely 
and relatively uncritically put himself in another's 
shoes. The structure of the Hypnosis Simulation test, 
on the other hand, pulls for a strongly self-critical 
set; S’s job is to fool an apparently sincere E. The 
role and self must be kept carefully distinct, so 
that a critical attitude can guide self-conscious modi- 
fications in behavior. He cannot put his “whole 
self” into the role because the self of which he is 
aware is precisely what he must try to hide. To 
do this, he must scrutinize each act and use his 
intelligence quite deliberately. 


If hypnotic susceptibility itself reflects still 
another kind of role-playing ability in which 
critical faculties are presumably laid aside, it 
is understandable that possessing high sus- 
ceptibility would not do much for one’s pro- 
ficiency at pretending to be hypnotized when 
he is not. Low-susceptible children gained ef- 
fective role-playing experience by being hyp- 
notized in a session preceding the acting and 
simulation tests, but highly susceptible chil- 
dren not only failed to increase their scores, 
but generally declined somewhat from the 
true to the false hypnotic sessions (11 of 15 
scores decreased). Whether they were so 
deeply involved hypnotically in the first ses- 
sion that it offered no rehearsal value for the 
second, or whether the demand for simulation 
in the second session had some especially in- 
hibiting effect on these children cannot be de- 
termined from the data at hand, since the 
hypnotic session always preceded the acting 
session. In any case, the reversal across ses- 
sions of the highest and lowest susceptibility 
groups precludes any simple interpretation of 
the relationship of hypnotic susceptibility to 
role-playing ability. Understanding the contri- 
bution of rehearsal effects from one session to 
another requires a study in which the role- 
playing measures are administered prior to 
the CHSS; this study is now in progress. 
19 male Es employing a Taffel-type task conducted a verbal conditioning 
experiment with 60 female Ss. ½ the Es were led to expect their Ss to show 
verbal conditioning, and ½ were led to expect no verbal conditioning. ½ the 
Es in each of these groups were led to feel that it would be desirable if their 
Ss showed conditioning, and ½ were led to feel that it would be undesirable. 
Those Es who (a) both wanted and expected, and (b) neither wanted nor 
expected their Ss to show increased use of I and WE pronouns obtained sig- 
nificant conditioning (p = .001). Those Es who (a) wanted but did not expect, 
and (b) expected but did not want increased use of I and WE pronouns ob- 
tained no significant conditioning (p = 1.00). Ss high in need for social ap- 
proval arrived earlier at the site of the experiment, were less “aware” of the 
contingency but were no more likely to show conditioning. Ss' ratings of Es' 
behavior during the experiment showed significant differences between Es in 
different experimental conditions, between Es who were 1st vs. later born, 
and between Es who were high vs. low in need for social approval. 


A series of experiments has recently been 
reported which suggests that for a variety of 
experimenters, subjects, and situations, the 
experimenter's expectancy or hypothesis may 
be a significant partial determinant of the 
results he obtains (Rosenthal, 1964). In most 
of our studies, the particular data experi- 
menters were led to expect were presumably 
also desired by them. The effects of experi- 
menter expectancy have therefore been con- 
founded with those of desirability. This seems 
reasonable from the standpoint of ecological 
validity. “Real” experimenters ordinarily 
want to obtain data they expect and do not 
want to obtain data they do not expect. 
Nevertheless one can think of research situa- 
tions wherein experimenters expect to obtain 
data they do not consider desirable for any 
of several reasons (e.g., Milgram, 1963). 
Moreover, one can also think of situations 
wherein experimenters do not really expect 
to obtain data which would be highly desira- 
ble (i.e., “long-shot” studies). 

The general purpose of the present investi- 
gation, then, was to study the separate and 
combined effects on research findings of the 
Grants G-24826 and GS-177 from the Division of 
experimenter's expectancy of certain results 
and the desirability to the experimenter of 
those results. A more specific purpose of the 
present study was to learn whether experi- 


partial determinants of the results of studies 
of verbal conditioning. An earlier study 


might be determinants of the degree of his 
subjects’ awareness that they had undergone 
a conditioning procedure. That study did not, 
however, vary the experimenter’s expectancy 
of whether conditioning would or would not 
occur; nor did it involve any manipulation 
of the outcome desirability of either subjects’ 
awareness or subjects’ conditioning scores 
(Rosenthal, Persinger, Vikan-Kline, & Fode, 
1963). 

Accordingly, half our experimenters were 
led to expect that their subjects would show 


well on hae while the remaining 
experimenters were led to feel that their sub- 
jects’ verbal conditioning would reflect badly 
on the experimenter. 


EXPECTANCY AND RESULTS OF RESEARCH 


METHOD 
Experimenters 


Nineteen male graduate students served as paid 
volunteer experimenters. All were in their first 2 
years of work at Harvard’s Division of Engineering 
and Applied Physics. 



Subjects 


The experimenters ran a total of 60 paid volunteer 
subjects, all of them female students at a Boston 
secretarial school. The experimental task was ex- 
plained to the subjects as a test of verbal facility. 


Experimental Task 


The experimental task was a modified Taffel 
(1955) procedure. The experimenters presented their 
subjects with four sets of 20 verbs, each of which 
was to be used in a sentence constructed by the 
subject. Each sentence was to begin with any of 
the following pronouns: I, WE, YOU, HE, SHE, and 
THEY. In order to reduce any errors of observation 
or recording, subjects wrote down each sentence 
and then read it aloud to the experimenter who 
checked to be sure that the subject had read the 
same pronoun she had recorded. On the last 60 
trials (three blocks of 20), the experimenters said 
“good” whenever I or WE was the pronoun selected. 
The dependent variable was a measure of the in- 
crease in the use of I or WE from the operant level 
to the subsequent blocks of trials. 
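
The scoring of the dependent variable can be sketched as follows. The sketch assumes each subject's record is simply the ordered list of the 80 sentence-initial pronouns; the function names and the sample record are hypothetical. It implements two of the definitions given under Results: the increase from the operant level (Block 1) to Block 4, and the increase from the operant level to the mean of the three subsequent blocks.

```python
REINFORCED = {"I", "WE"}


def block_counts(pronouns, block_size=20):
    """Number of I or WE sentences in each consecutive block of trials."""
    return [
        sum(p.upper() in REINFORCED for p in pronouns[start:start + block_size])
        for start in range(0, len(pronouns), block_size)
    ]


def conditioning_score(counts):
    """Increase from the operant level (first block) to the last block."""
    return counts[-1] - counts[0]


def mean_gain(counts):
    """Increase from the operant level to the mean of the later blocks."""
    later = counts[1:]
    return sum(later) / len(later) - counts[0]


# Hypothetical record: 5, 7, 8, and 10 reinforced pronouns in Blocks 1-4.
record = (
    ["I"] * 5 + ["HE"] * 15
    + ["WE"] * 7 + ["SHE"] * 13
    + ["I"] * 8 + ["THEY"] * 12
    + ["WE"] * 10 + ["YOU"] * 10
)
counts = block_counts(record)
print(counts, conditioning_score(counts), round(mean_gain(counts), 2))
```

Taking the first block as the baseline matters because no reinforcement is given during it, so any later increase is measured against each subject's own unreinforced rate.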


Procedure 

The experiment was conducted in one day at two 
different hours. All experimental conditions were 
represented in each session. The experimenters in 
each session were trained as a group. They were 
given factual material about the phenomenon of 
verbal conditioning. Two reasons were advanced for 
their participation in this experiment. The first 
stressed the need for replication by researchers who 
were not behavioral scientists in order to extend the 
generality of the findings in the verbal conditioning 
literature. The second stressed our interest in learn- 
ing more of the relationship between the experi- 
menters’ personality and their subjects’ condition- 
ing scores. All experimenters were then administered 
the Marlowe-Crowne Social Desirability scale, the 
Taylor Manifest Anxiety scale, a form for estab- 
lishing birth order, and the first set of 20 verbs they 
would present to their subjects. This was done to 
obtain the experimenters’ operant level for using I 
and WE and was obtained of course before the ex- 
perimenters knew which pronouns they would be 
reinforcing. The experimenters were told that their 
subjects would be assigned them on the basis of 
the subjects’ similarity to the experimenters in 
personality as measured by the tests the experi- 
menters had taken. The same tests were in fact 
administered to the subjects, but assignment of 
subjects to experimenters was essentially random. 
After each subject finished her experimental task 
and left her experimenter’s research room she filled 
out two questionnaires designed to define the degree 
of her awareness that she had undergone a verbal 
conditioning procedure. The first questionnaire (Q1) 
simply asked the subject to state the purpose of the 
experiment (Matarazzo, Saslow, & Pareis, 1960). The 
second questionnaire (Q2) repeated the substance 
of the first but asked more specific and more leading 
questions (Levin, 1961). Both questionnaires had 
been modified for use in an earlier study (Rosenthal 
et al., 1963). Each subject also filled out a series of 
28 rating scales designed to assess her perception 
of her experimenter. Each scale had 20 points run- 
ning from +10 (e.g., extremely businesslike) to −10 
(e.g., extremely unbusinesslike) with intermediate 
labeled points. This same questionnaire had been 
employed in earlier experiments (Rosenthal, Fode, 
Friedman, & Vikan-Kline, 1960). 


Experimental Conditions 


The experimental treatments were administered in 
the form of “last-minute instructions” placed on each 
experimenter's desk. For half the experimenters the 
instructions claimed that their subjects had person- 
ality characteristics such that they would condition 
well. The remaining experimenters were led to expect 
their subjects to condition poorly. The desirability 
of these two outcomes was implemented by telling 
half the experimenters that conditionability was 
highly correlated with general learning ability and 
by telling the others that it was highly correlated 
with susceptibility to deliberate manipulation. Since 
the experimenters believed themselves to be similar 
to their subjects in personality, the first group of 
experimenters should find good conditioning data a 
desirable outcome since it would imply that the 
experimenter, like his subjects, had good general 
learning ability. The remaining experimenters should 
find good conditioning data an undesirable outcome 
since it would imply that the experimenter, like his 
subjects, was highly manipulatable. 

Those portions of the instructions used to imple- 
ment data desirability were as follows: 


[Desirable]: There are a few things we now 
know about subjects who are more and less condi- 
tionable. Our hope, of course, is to learn a good 
deal more about that. What we know so far 
suggests that highly conditionable people tend 
to have high general learning ability. They pick 
up new concepts and ideas quickly and have skill 
in analyzing and solving problems. Poor condi- 
tioners, in contrast, tend to have lesser abilities 
in these areas. 

[Undesirable]: There are a few things we now 
know about subjects who are more and less condi- 
tionable. Our hope, of course, is to learn a great 
deal more about that. What we know so far 
suggests that highly conditionable people tend to 
be manipulatable. They are often like putty in the 
hands of advertisers and salesmen. Poor condi- 
tioners, in contrast, tend to be very resistant to 
such manipulation; in other words, they seem to 
have minds of their own. 


Within each of the two above conditions of data 
desirability, half the experimenters received one of 
the following two additional instructions specifying 
expectancy: 


[Expect]: The particular subjects assigned to 
you, on the whole, tend to be good conditioners. 
That is, they will tend to show a significant in- 
crease in the number of “I” and “WE” pronouns 
from the first set of 20 sentences to the later sets. 


[Don’t expect]: The particular subjects assigned 
to you will, on the whole, tend to be poor condi- 
tioners. That is, they will not tend to show a 
significant increase in the number of “I” and “WE” 
pronouns from the first set of 20 sentences to the 
later sets. 


The method described above of implementing our 
outcome-desirability variable was selected on the 
basis of an instruction pretest with Harvard under- 
graduates from a course in motivation. These sub- 
jects received eight characterizations of the person- 
ality correlates of conditionability. Of these, four 
were designed to be desirable and four undesirable. 
The subjects were instructed to imagine themselves 
to be experimenters running an experiment on the 
personality correlates of verbal conditioning. Further, 
they were asked to imagine that their “subjects” 
were assigned to them on the basis of similarity 
(to themselves) on important personality dimensions 
“so that any interpretation of experimental results 
should apply to [them] ...as well as to [the] 
subjects." Under these role-playing conditions sub- 
jects were asked to rate the eight characterizations 
on five scales: the desirability of having highly 
conditionable subjects, assuming the characterizations 
to be correct; the desirability of having subjects 
who were highly resistant to conditioning, assuming 
the characterizations to be correct; the probability 
that a competent psychologist would be right if he 
predicted their subjects to be highly conditionable; 
the probability that a competent psychologist would 
be right if he predicted their subjects to be highly 
resistant to conditioning; the believability of the 
characterization if made by a competent psychologist. 

FIG. 1. Frequency of choice of I and WE as a 
function of experimenter expectancy, outcome desira- 
bility, and trial block. 

The two characterizations used in the present 
experiment were selected because the patterns of 
ratings assigned them on our five scales were gen- 
erally superior to those of the other characteriza- 
tions. That is, our pretest subjects rated these two 
characterizations as relatively believable, and rela- 
tively desirable in one case and undesirable in the 
other. Moreover, they expressed considerable readi- 
ness to concur with either of the two opposite 
predictions by a competent psychologist. 


Precautions against Authors’ Expectancy Effects 


We took a number of precautions to prevent our 
own outcome expectancies and outcome desires from 
having major effects on the data collected by the 
experimenters. These included: randomly assigning 
rooms to conditions; assigning experimenters to 
rooms in order of departure from the experimenter 
reception room; implementing the independent vari- 
ables in a way involving no contact with the 
experimenters by persons who were aware of the 
experimenters’ treatment conditions; assigning sub- 
jects to experimenters on the basis of (a) order of 
subjects’ departure from the subject reception room, 
(b) the experimenter's immediate availability to run 
a new subject, and (c) the number of subjects run by 
each experimenter up to that point. 

The above procedures left all the major investi- 
gators except one blind to each experimenter's treat- 
ment condition. The lone exception was the author 
who was in charge of assigning rooms to conditions 
and ensuring that approximately equal numbers of ex- 
perimenters and subjects were assigned to each treat- 
ment condition. He had no contact with either experi- 
menters or subjects during the course of the experi- 
ment. With the foregoing precautions, the likelihood 
that the major investigators’ own expectancies and 
desires substantially affected the data seems small. 




RESULTS 
Conditioning 


Initially four alternative definitions of con- 
ditioning were employed: increase in I-WE 
usage from the operant level to Block 4; 
increase in I-WE usage from the operant level 
to the mean of the subsequent three blocks; 
increase in I-WE usage from the operant level 
to Block 4 plus one-third the increase from 
Block 2 to Block 3; monotonicity of increase 
of I-WE usage from Blocks 1 to 4 as measured 
by rank correlation between number of I-WE 
responses and block number. The median 
intercorrelation of these dependent variables 
was .79. Because the first of these definitions 
was both the simplest and most highly cor- 
related with the other definitions, it was 
accepted as the definition of magnitude of 
conditioning.² 

Figure 1 shows the mean of the mean 
number of I-WE responses obtained by experi- 
menters in each of the four experimental 
groups for each of the four blocks. Table 1 
shows the mean conditioning scores (Block 4 
− Block 1) and Table 2 the results of the 
analysis of variance. Only the interaction was 
significant. Apparently, then, neither experi- 
menter's expectancy nor the desirability of 
conditioning data alone affects the magni- 
tude of conditioning scores reliably, but the 
congruence between expectancy and data de- 
sirability does make a substantial difference. 
Under the congruent conditions 100% of the 
experimenters showed a mean increase in 
their subjects’ use of I and WE (p = .001). 
Under the incongruent conditions only 56% 
of the experimenters showed a mean increase 
(p = 1.00). 

² Among subjects judged as clearly aware a few 
specifically mentioned their decision to go along or 
not go along with their experimenters’ attempts 
to influence them. 


TABLE 1 


EXPERIMENTERS’ MEAN CONDITIONING SCORES 
FOR FOUR EXPERIMENTAL GROUPS 


[Table body illegible; columns are headed by outcome desirability.] 


TABLE 2 


ANALYSIS OF VARIANCE OF EXPERIMENTERS’ 
MEAN CONDITIONING SCORES 


Source              df      MS        F 
Expectancy (A)       1     0.18 
Desirability (B)     1     0.06 
A × B                1    22.26    10.55* 
Within              15     2.11 


* p = .006. 


Awareness 


The awareness questionnaires were inde- 
pendently and blindly scored by two of the 
authors on a 3-point scale: clearly unaware, 
1; vaguely aware, 2; and clearly aware of 
the response-reinforcement contingency, 3. 
The reliabilities of Q1 and Q2 were .95 and 
[illegible], respectively. 

Of all subjects 17% were classed as clearly 
aware, 8% as vaguely aware, and 75% as 
clearly unaware (Q2). 

Experimenters who expected conditioning 
tended to obtain a lower rate of clear aware- 
ness (7%) than did experimenters who did 
not expect conditioning (25%). Because the 
bulk of subjects in both experimenter ex- 
pectancy conditions were “clearly unaware,” 
the difference in rates of clear awareness 
approached significance (p = .10) only when 
the analysis was limited to subjects who were 
either vaguely or clearly aware. 

Aside from her experimenter's treatment 
condition, two factors proved to be related 
to the subject's subsequent awareness: the 
subject's personality and the order in which 
she was run by the experimenter. Subjects 
who scored as more anxious (r = −.22, p 
= .10) and as higher in need for social ap- 
proval (r = −.30, p = .02) were less likely 
to become aware subsequently. Subjects run 
later by a given experimenter were more 
likely to become aware (r = .26, p = .05). 
In addition, all the subjects run in the second 
experimental session were more likely to be- 
come aware than subjects run in the first 
session (p = .08). 


Subjects’ Perceptions of Experimenters 


Subjects had rated their experimenters on 
28 variables immediately after the experi- 
ment. Those experimenters who had been in 
the congruent experimental treatment groups 
were rated by their subjects as more casual 
(r = .33, p = .01), more courteous (r = .27, 
p = .05), more pleasant (r = .24, p = .08), 
more expressive-voiced (r = .24, p = .08), 
and as less given to the use of movements of 
the trunk region (r = −.26, p = .05). Be- 
cause of the intercorrelations among these 
particular variables and among the total set 
TABLE 3 


SUBJECTS’ CONDITIONING SCORES AND THEIR 
PERCEPTION OF EXPERIMENTERS 


Variable                   r       p 
Interested                .43     .001 
Businesslike              .43     .001 
Professional              .33     .01 
Quiet (nonloud)           .31     .02 
Enthusiastic              .28     .04 
Behaved consistently      .26     .05 
Expressive-voiced         .24     .08 


of 28 variables, no simple statement is pos- 
sible of how many of these particular correla- 
tions might be attributed to chance. It seems 
likely, however, that the experimenters' be- 
havior during the experiment, as defined by 
their subjects’ ratings, was at least in part 
determined by the experimental treatment 
conditions. 

Table 3 shows the correlations between the 
magnitude of the subjects' conditioning scores 
and their perceptions of their experimenters. 
We cannot be sure, of course, that these 
ratings of experimenters actually do reflect 
differences in experimenter behavior. It is 
also possible that those subjects who are more 
susceptible to the interpersonal influence of 
a reinforcing experimenter simply describe 
experimenter behavior differently. Or, having 
been influenced by a reinforcing experimenter, 
these subjects may have rated that experi- 
menter according to their preconceptions of 
the sort of person by whom they would permit 
themselves to be influenced. Assuming for the 
moment that these ratings accurately de- 
scribed experimenter behavior, subjects were 
more influenced by experimenters showing a 
general enthusiastic interest in them; convey- 
ing a consistent, professional, businesslike 
manner; and speaking in a quiet but expres- 
sive tone of voice. If these experimenters did 
not in fact behave in this way, at least it 
seems warranted to believe that more in- 
fluencible subjects ascribe such characteris- 
tics to the experimenters by whom they are 
influenced. 


Experimenter Characteristics 


Experimenters’ birth order, operant level of
I and WE, and need for social approval were
not related to their subjects’ conditioning
scores. Experimenters’ anxiety scores were
related to their subjects’ conditioning in a
nonlinear manner. Both high- and low-
anxious experimenters obtained greater condi-
tioning than did medium-anxious experi-
menters (F = 3.08, df = 2/13, p = .08).

While not related to subjects’ conditioning,
experimenters’ birth order appeared to be a
significant predictor of experimenters’ behav-
ior in the experiment as defined by subjects’
ratings. Table 4 shows the correlations be-
tween experimenters’ birth order and a num-
ber of behavioral variables. First-born ex-
perimenters were generally rated as fast but
reluctant speakers who used fewer body and
facial movements and expressions.

Experimenters who used more I and WE
pronouns in their pretesting were rated by
their subjects as more casual (r = .34, p
= .01), more enthusiastic (r = .22, p = .10),
and more pleasant-voiced (r = .22, p = .10).
Experimenters scoring higher in need for
social approval were rated by their subjects
as less personal (r = −.32, p = .02), less
loud (r = −.27, p = .05), less talkative (r
= −.22, p = .10), more enthusiastic (r = .27,
p = .05), but less well-liked (r = −.22, p
= .10). None of the subjects’ ratings of
experimenters’ behavior during the experi-
ment showed a correlation with experimenter
anxiety which was significant at the .05 level.


Subject Characteristics 


Subjects’ birth order, anxiety, and need for
social approval were found to be unrelated to
subjects’ conditioning scores. Subjects high
in need for social approval were found to be
later born more often than first born (r
= −.24, p = .08) and, interestingly enough,
to have arrived earlier at the site of the
experiment (r = .40, p = .003).


TABLE 4


RELATIONSHIPS BETWEEN EXPERIMENTERS’ BEING
FIRST BORN AND RATINGS BY THEIR SUBJECTS


Variable             r       p
Less talkative      .37    .006
Fast speaking       .32    .02
Body use           −.32    .02
Trunk use          −.27    .05
Hand gestures      −.26    .05
Expressive face

Subjects’ operant levels of I and WE re-
sponses were not significantly related to their
conditioning scores (r = −.10). Subjects’
operant levels, however, tended to be a func-
tion of experimenters’ treatment conditions.
Those experimenters for whom data desira-
bility and expectancy operated conjointly ob-
tained higher rates of operant level respond-
ing from their subjects (F = 3.22, df = 1/15,
p = .10). Thus while congruence of data de-
sirability and experimenter expectancy was
associated both with high operant levels and
high conditioning scores, it is clear that the
conditioning scores cannot be attributed to
the operant levels.


Model Bias


The extent to which a given experimenter’s
own performance of a task determines his
subjects’ performance of the same task is the
extent to which the experimenter “models”
his subjects. A recent summary of experi-
ments testing the hypothesis of modeling ef-
fects suggested that in different experiments
there might be different orders of magnitude
of experimenter modeling effects (Rosenthal,
1964).

In the present study, modeling effects were
defined by the correlation between experi-
menters’ own operant levels of I and WE and
the mean operant levels of their subsequently
run subjects.

Table 5 shows these correlations for each
of the four experimental conditions. Two of
the four correlations were significant at the
.05 level, and the four r’s were significantly
different from one another (χ² = 17.68, df
= 3, p < .001). It appears then, that whether
the experimenter expects and/or desires a
certain outcome may significantly affect the
direction and magnitude of experimenter
modeling effects.
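
The comparison of the four correlations can be illustrated with a standard homogeneity test on Fisher-transformed correlations. The text does not say which procedure the authors used, and the number of experimenters per condition is not given in this passage, so the sample sizes below are purely hypothetical placeholders. The sketch is not expected to reproduce the reported chi-square value.

```python
import math


def correlation_homogeneity_chi2(rs, ns):
    """Chi-square statistic for equality of k independent correlations.

    Each r is converted to Fisher's z = atanh(r). The statistic is the
    weighted sum of squared deviations of the z values from their weighted
    mean, using weights n - 3. Returns (chi2, degrees_of_freedom).
    """
    if len(rs) != len(ns):
        raise ValueError("rs and ns must have the same length")
    weights = [n - 3 for n in ns]
    if any(w <= 0 for w in weights):
        raise ValueError("each sample size must exceed 3")
    zs = [math.atanh(r) for r in rs]
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    chi2 = sum(w * (z - z_bar) ** 2 for w, z in zip(weights, zs))
    return chi2, len(rs) - 1


# Correlations as printed in Table 5; the sample sizes are ASSUMED for
# illustration only and do not come from the article.
table5_rs = [-0.88, 0.997, 0.03, -0.74]
hypothetical_ns = [5, 5, 5, 5]
chi2, df = correlation_homogeneity_chi2(table5_rs, hypothetical_ns)
```

The statistic would be referred to a chi-square distribution with k − 1 degrees of freedom, which for four conditions matches the df = 3 reported in the text.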


TABLE 5


CORRELATION BETWEEN EXPERIMENTERS’ AND
SUBJECTS’ OPERANT LEVELS FOR FOUR
EXPERIMENTAL CONDITIONS


                            Expectancy
Outcome desirability    Expect    Don't expect
Desirable               −.88*     +.997**
Undesirable             +.03      −.74


* p < .05, two-tailed.

** p < .005, two-tailed.


Qualitative Analysis of Awareness Question-
naires


Most of the subjects (88%) felt the pur-
pose of the experiment was to assess their
personality. Since the subjects had filled out
personality questionnaires, this ascription of
purpose was natural enough. Many of the
subjects, however, saw the “test of verbal
facility” as a personality test akin to word-
association or sentence-completion techniques.
More specifically, several subjects saw it as
a test of their egocentricity, as measured by
the frequency of their use of I.

Only 30% of all subjects believed their 
experimenters when they told them their 
verbal abilities were being assessed. Among 
subjects run by experimenters in the con- 
gruent conditions, only 18% believed their 
abilities were being assessed. In the incon- 
gruent conditions, 46% of the subjects be- 
lieved their abilities were being assessed. 
The differences in belief rates were signifi-
cant (χ² = 5.68, p = .02), suggesting that the
behavior of the experimenters in the con- 
gruent conditions made it seem more unlikely 
to their subjects that their verbal abilities 
were being assessed. 

It has been suggested elsewhere (Rosenthal 
et al., 1963) that subjects may be interested 
in their experimenters as people rather than 
simply as “scientists.” Evidence of a “trans-
ference reaction” was presented. In the pres-
ent study, 20% of all subjects made some
reference to one or more physical characteris- 
tics of their experimenter which were irrele- 
vant to the experimenter’s role performance. 
These included mention of the experimenter’s 
posture, clothing, facial blemishes, wearing 
of glasses, condition of teeth, and relative 
attractiveness. 


Discussion 


The results of the present experiment were 
both unequivocal and surprising, and their in- 
terpretation can at best be only tentative. 
This was the first experiment in our research 
program in which the experimenters’ expec-
tancies were varied independently of the de-
sirabilities of the outcomes. In most of the
earlier research experimenters who expected a 
given outcome probably also desired it while 
those who did not expect that outcome also 
did not desire it. There have, however, been 
a few studies in which it could be argued that 
all experimenters desired a given outcome 
while differing in their expectancy of it. Such 
would be the case in experiments employing 
animal subjects in which all experimenters 
wanted their subjects to perform well since 
their course grades might depend on it (Ro- 
senthal, 1964). Those experimenters who ex- 
pected better performance from their subjects 
obtained better performance than did those 
who expected poorer performance. For these 
experiments it could be argued from the re-
sults of the present study that if there had
been a group of experimenters who neither 
expected nor desired good performance from 
their subjects they would have obtained per- 
formance as good as that obtained by ex- 
perimenters who both wanted and expected 
good performance. Only another experiment 
can answer this question for us. 

But, from one point of view, the present 
study seems to contradict the bulk of the ear- 
lier research in which opposite expectancies 
(coupled with presumably congruent motives) 
were associated with correspondingly opposite 
results (Rosenthal, 1964). In the present 
study, on the other hand, opposite expectan- 
cies combined with congruent motives pro- 
duced identical results. The present study dif- 
fered from earlier ones in a number of ways 
any of which alone or in interaction with one 
another could account, even if not simply, for 
the differences. The present study was the 
first to: employ expectancies about verbal 
conditioning performances, employ as experi- 
menters graduate students in the physical 
sciences, and create expectancies about verbal 
behavior in which experimenters were ex- 
plicitly taught how such behavior could be 
intentionally manipulated thereby confound- 
ing the unintentional biasing process with the 
intentional reinforcement process. 

Perhaps the simplest tentative explanation 
is based on a reexamination of the phenome-
nology of the experimenters in the various ex-
perimental conditions. Those experimenters
who both expected and wanted conditioning
or neither expected nor wanted conditioning 
were told by us essentially that we thought 
they were particularly clever in the one case 
and that they had minds of their own in the 
other. Thus the congruent experimenters were 
complimented by the investigators. On the 
other hand, experimenters in the incongruent 
conditions were told essentially that we
thought them to be either not too bright or 
like putty in the hands of manipulators. The 
experimenters in the incongruent condition 
then were anything but complimented by their 
employers. These experimenters could have 
been emotionally affected to the point where
their verbal “reinforcements” lacked sufficient
conviction to be positive reinforcers for their
subjects. Experimenters in the noncongruent 
condition were in fact rated by their subjects 
as less expressive-voiced than experimenters 
in the congruent condition (p = .08) and ex- 
pressiveness of voice was positively corre- 
lated with successful conditioning (r = + .24, 
p = .08).

If the interpretation offered to account for 
our surprising results is correct, then the
present experiment in no way contradicts ear- 
lier findings, although the relation between 
the two sets of results requires further clari-
fication. In the bulk of the previous work,
affect was not experimentally manipulated, 
while in this study we must conclude that the 
experimenter's affect or mood is a more im- 
portant determinant of his effectiveness as a 
reinforcer than either his expectancy or the 
desirability of the outcome in studies of verbal 
conditioning. 

Among those subjects showing some indica-
tion of awareness, more clear awareness was
shown by subjects whose experimenters had
been led to expect no conditioning. It
seems possible, therefore, that some of the 
ambiguity surrounding the question of aware- 
ness rates in studies of verbal conditioning 
may be associated with the experimenter’s 
expectancy regarding subjects’ conditionability 
as well as his expectancy about subsequent 
awareness (Rosenthal et al., 1963). 

Studies by Crowne and Strickland (1961) 
and by Marlowe (1962) found that subjects 
with a greater need for social approval showed 
greater verbal conditioning effects. The pres-
ent study, like that by Spielberger, Berger,
and Howard (1963), found no such relation-
ship. However, those of our subjects with a
greater need for approval showed significantly
less awareness of the “response-reinforcement”
contingency. Quite possibly these subjects rec-
ognized that the socially desirable thing to
do when a psychological investigator inquires
after awareness in a conditioning experiment
is to “not see through” the experimental situ-
ation. This interpretation is quite consistent
with the position that the need for approval is
a tendency to respond appropriately to per-
ceived situational demands (Crowne & Mar-
lowe, 1964). Further construct validation of
the Marlowe-Crowne scale comes from our
finding that subjects higher in need for ap-
proval were more likely to arrive earlier at
the site of the experiment than subjects lower
in need for approval. However, why subjects 
higher in need for approval should more often 
be later born than first born is by no means 
obvious. 

The finding that subjects run later by a
given experimenter were more likely to be
aware is most parsimoniously interpreted as
due to later-run subjects having a lower need
for social approval. The finding that subjects
run in the second experimental session were
more likely to be aware is less clearly ex-
plained. One likely interpretation involves the
possibility of feedback from Session I subjects 
to Session II subjects. This is not a trivial 
problem. We may wonder about the effects of 
feedback from earlier- to later-run subjects in 
a good deal of behavioral research. Needed are 
some hard data on the efficacy of the optimis- 
tically solicited loyalty oath wherein we swear 
our earlier-run subjects to secrecy “until the 
experiment is over.” 

Asking subjects to describe their experi-
menter’s behavior during the experiment
seems to be a useful technique. On the basis
of these descriptions we were able to differ-
entiate experimenters under the various experi-
mental conditions. These descriptions suggest
how the preexperimental manipulations of ex-
perimenter variables (as well as the experi-
menter’s more enduring personal characteris- 
tics) might be translated into unprogramed 
experimenter behavior during the experiment. 
Our data suggest that it is this unprogramed 
behavior which is responsible for the experi- 
menter’s unintentional effect upon the results 
of his experiment. 
USE OF TWO DIFFERENT RESPONSE MODES AND
REPEATED TESTINGS TO PREDICT
SOCIAL CONFORMITY ¹


LEWIS R. GOLDBERG 
The usefulness of an economical method for measuring social conformity was
explored. The effects upon yielding of (a) accurate as opposed to inaccurate
information and (b) instructions to remember and reproduce previous responses
as opposed to instructions to ignore previous responses were studied. In ad-
dition, this study explored the predictability of individual differences in
yielding through the use of a personality inventory, examining the effects of
repeated administrations of the inventory as well as the effects of having Ss
predict the responses of others as opposed to having Ss personally respond to
the items. Results suggested that this approach to the measurement of social 
conformity is fruitful, although individual differences in yielding were poorly 
predicted by the personality inventory under any of the experimental conditions. 


This methodological study of the prediction 
of social conformity was designed so that a 
number of previously unrelated hypotheses 
could be tested simultaneously. 


Measurement of Social Conformity 


Many social critics have commented on the 
apparent tendency of individuals in con- 
temporary society to conform to group 
opinion. One result of this social commentary
has been the attempt to devise techniques 
whereby conforming behavior could be 
studied in the laboratory. Asch (1952), using 
an experimental procedure in which one sub- 
ject and eight stooges made judgments con- 
cerning the comparative lengths of lines, was 
able to show that individuals faced with a 
unanimous consensus would often disregard 
the evidence of their own visual sensations in 
order to conform to the group judgment. 
Crutchfield (1955) mechanized Asch’s pro- 
cedure so that no stooges were required and 
a number of subjects could be tested at one 
session. In addition to perceptual judgments, 
Crutchfield included both statements of
opinion and unsolvable problems among the
stimuli to which he had his subjects respond.
He was able to show that the more ambiguous
the stimulus, the greater the tendency of
subjects to conform to group opinion.


¹ The study was supported by National Science
Foundation Grant G-25123. Data analyses were

Tuddenham (e.g., 1958a, 1958b), in an in-
tegrated series of yielding and conformity
studies, replicated much of the earlier work,
and, in addition, compared the effects pro-
duced by the provision of accurate as op-
posed to inaccurate information. He found
that whereas inaccurate information increased
judgmental disparity, accurate information
induced convergence. Fisher and Lubin
(1958) investigated the effect of distance
between the subject's position and the spuri-
ous report upon two measures of influence:
movement, the amount which the subject
changed from his original position, and con-
formity, the extent to which the subject
yielded to the other's position. While these
two indices produce identical results when the
initial distance is the same for all subjects,
Fisher and Lubin demonstrated that as dis-
tance increases, movement tends to increase
while conformity tends to decrease.
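
The distinction between the two indices can be made concrete with a small sketch. The exact formulas used by Fisher and Lubin are not given here, so this is an assumption: movement is taken as the absolute change in position, and conformity as the shift toward the reported position expressed as a fraction of the initial distance. The function names and the numbers in the example are hypothetical.

```python
def movement(initial, final):
    """Amount by which the subject changed from the original position."""
    return abs(final - initial)


def conformity(initial, final, reported):
    """Shift toward the reported position as a fraction of the initial distance.

    ASSUMED normalization: 1.0 means the subject moved all the way to the
    reported position, 0.0 means no net approach. Returns None when the
    subject already agreed with the report, since no yielding is possible.
    """
    distance = abs(reported - initial)
    if distance == 0:
        return None
    shift_toward = distance - abs(reported - final)
    return shift_toward / distance


# Hypothetical subjects on a 9-point scale, both starting at 2.
near = (movement(2, 4), conformity(2, 4, reported=6))  # initial distance 4
far = (movement(2, 5), conformity(2, 5, reported=9))   # initial distance 7
```

With these made-up values the more distant report produces the larger movement but the smaller proportional conformity, which is the pattern described above. When every subject starts at the same distance, the second index is simply the first divided by a constant, so the two rank subjects identically.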

While the techniques developed by Asch
and Crutchfield and utilized by Tuddenham,
Fisher and Lubin, and others are ingenious 
and produce dramatic results, they are also 
expensive and time consuming. Much more 
practical as a mass testing technique is one 
described more than 40 years ago by Moore 
(1921), an extension of the original method
of Münsterberg (1914). The procedure, uti-
lized more recently by Hastorf and Piper
(1951), requires only that a group of indi-
viduals respond to a questionnaire on more
than one occasion. The first occasion serves
to obtain data concerning the subject's un-
biased answers to the questions. The second
administration serves to estimate the extent
to which the subject will change his responses
when presented with either correct or incor-
rect information concerning the manner in
which the rest of the group responded to the
first administration of the questionnaire.
Hastorf and Piper demonstrated that this
technique produced substantial conformity to
purported group opinion even when the sub-
jects were instructed to try to remember their
responses to the first administration and give
the exact same response on the second ad-
ministration (i.e., even though respondents
were instructed to remember and reproduce
their responses from one administration of a
questionnaire to another, they still were in-
fluenced by data pertaining to the responses
of others).

With the aim of further establishing this 
technique of measuring social conformity, the 
present study attempted to replicate and ex- 
tend the work of Hastorf and Piper (1951) 
and Tuddenham (1958a, 1958b) by system- 
atically manipulating: (a) the presentation of 
accurate as opposed to inaccurate information 
and (b) instructions to remember previous re-
sponses as opposed to instructions to respond
without regard for previous responses. In
addition, the two different indices of con-
formity proposed by Fisher and Lubin
(1958) were operationalized and compared.


Search for Nonfakable Inventory Items 


Psychologists attempting to improve the
validity of personality inventories have con- 
sistently been frustrated in their search for 
items that are sufficiently subtle that they 
are not fakable. Though the forced-choice 
inventory appeared to promise a solution to 
this problem, recent experimental evidence 
indicates that the forced-choice format 
neither solves the problem of faking (e.g., 
Dicken, 1959) nor increases predictive valid- 
ity beyond that of more simple formats (e.g., 
Heilbrun, 1962; Scott, 1963). The problem
of faking stems from the fact that some sub- 
jects may try to present a favorable impres- 
sion of themselves while others try to respond 
more honestly. One possible solution to the
problem is to change the basic task of the 
subject from that of reporting his own in- 
ternal states to that of assessing as accurately 
as possible the beliefs or attitudes of others— 
a task supposedly not amenable to faking 
(e.g., Campbell, 1950). Instead of asking a 
respondent to evaluate an item as true or 
false of himself, one might ask him to indi-
cate whether most people would answer true
or false, or whether some selected group of
subjects (e.g., successful executives) would
endorse the item or not (Gordon, 1949).
Messick (1960), on the basis of a factor-
analytic study of social desirability ratings, 
suggested that the task of rating the social 
desirability of personality test items might be 
used to extract personological information 
(i.e., that knowledge of how an individual
sees people in the world around him might 
provide important data relevant to the pre- 
diction of that individual's behavior). 

Goldberg (1962) tested Messick's hypothe-
sis in a relatively informal manner. Thirteen
students in an undergraduate course in per- 
sonality assessment carried out individual re- 
search projects in which they pitted the typi- 
cal personal endorsement procedure against 
social desirability ratings as potential indi-
cators of personality trait variance. Results
indicated that in none of the 13 studies did 
social desirability ratings do a better job than 
the traditional endorsement procedure in dif- 
ferentiating criterion groups, though in 5 of 
the studies they fared about as well. Conse- 
quently, these 5 studies were later cross- 
validated. In all cases the shrinkage upon 
cross-validation was greater for the social 
desirability rating procedure than for the 
typical endorsement procedure.

Jackson (1961, 1964) used desirability
ratings of 45 inventory items to predict social 
conformity. His instructions, similar to those 
used by Edwards (1957), asked the subjects 
to judge on a 9-point scale the desirability 
of a true response to a statement when the 
statement is applied to other people. Jackson 
found that the desirability judgments had a 
slightly higher relationship with his criterion
(.29) than did the personal endorsement re- 
sponses (.22), and suggested that in faking 
situations the disparity in favor of desira- 
bility judgments might be even greater. 
However, when Loomis and Spilka (1963) 
replicated Jackson’s study, they found that 
desirability judgments correlated —.15 with 
the criterion, although the validity of the 
endorsement responses (.26) was about the 
same as that found by Jackson. The present 
study was designed as a modification of those 
by Jackson and Loomis and Spilka, intro- 
ducing a number of control conditions absent 
from these previous studies. The hypothesis 
tested was that behavioral criteria can be 
predicted from predictions of the responses
of others as well as from the more traditional 
personal endorsement responses. 


Effects of Repeated Testings 


On the basis of research carried out by 
Fiske (1957a, 1957b) and Mitra and Fiske 
(1956), Howard (1964) has hypothesized 
that, as a personality inventory is adminis- 
tered to a group repeatedly, interpersonal 
variability increases, while intrapersonal vari-
ability decreases (i.e., that the responses
made by an individual become more con- 
sistent and more differentiating). A series of 
related studies tends to substantiate the pro-
posal (e.g., Howard & Diesenhaus, 1965).
Since response instability and interindividual 
similarity may both act as constraints upon 
predictive validity (Goldberg, 1963), it fol- 
lows that repeated testing should increase 
validity by simultaneously maximizing both 
intraindividual response stability and inter- 
individual variability. The present study 
undertook a direct test of the hypothesis that 
later, as opposed to earlier, administrations 
of a personality inventory would yield higher 
indices of predictive validity.

Traditional psychometric theory would hold
that any one administration of an inventory 
provides but one estimate of the respondent’s 
true score on that inventory. Successively 
obtained scores will not be equal, but should 
be distributed around the respondent’s true 
score, which, if it were known, would provide 
the highest validity coefficients. From this it
follows that the more accurately the true
score can be estimated, the more accurate
should be the prediction. Since the variance
of the mean of the scores from repeated ad-
ministrations is less than that of the indi-
vidual scores, the mean provides a more accu-
rate estimate of the true score than does
any one of the individual scores. Therefore,
the validity coefficients based on the mean
would be expected to be higher than those
for any one administration. If the hypotheses
of Howard are correct, however, it may be
that earlier administrations are simply adding
error variance, so that predictions based on
later administrations would be more accurate
than those based on the mean. The hypothesis
that predictions based on the mean are more
accurate than those based on any single ad-
ministration was among those tested in the
present study. 
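
The traditional argument can be sketched numerically. Under the classical assumptions (each administration equals the same true score plus independent error of equal variance), the error variance of the mean of k administrations is the single-administration error variance divided by k. The reliability of the mean then follows the Spearman-Brown formula. The function names and example values below are illustrative assumptions, not figures from the study.

```python
def error_variance_of_mean(error_variance, k):
    """Error variance of the mean of k administrations.

    ASSUMES independent errors with equal variance across administrations.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return error_variance / k


def reliability_of_mean(single_reliability, k):
    """Spearman-Brown reliability of the mean of k parallel administrations."""
    if k < 1:
        raise ValueError("k must be at least 1")
    rho = single_reliability
    return k * rho / (1 + (k - 1) * rho)


# Hypothetical values: error variance 4.0, single-administration
# reliability .60, and four administrations as in the present design.
var_mean = error_variance_of_mean(4.0, 4)
rel_mean = reliability_of_mean(0.60, 4)
```

These formulas hold only if every administration measures the same true score with comparable error. Howard's hypothesis amounts to denying that assumption for the early administrations, which is why the mean and the later administrations make competing predictions.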

In summary, the present study sought to 
ascertain further the usefulness of an eco- 
nomical method of measuring social conform- 
ity. In order to do so, it explored the effects 
upon two indices of social conformity of
accurate as opposed to inaccurate information 
and compared instructions to remember and 
reproduce previous responses with instruc- 
tions to ignore previous responses. In addi- 
tion, in an attempt to improve the prediction 
of social conformity from personality inven- 
tories, the study investigated the hypothesis 
that conformity can be predicted as accu- 
rately from subjects’ predictions of others’ 
responses to inventory items as from the tra- 
ditional endorsements of the same items. A 
final aim of this study was to investigate the 
effects of repeated testings upon predictive 
validity, discovering whether later adminis- 
trations of an inventory are more valid than 
earlier administrations on the one hand and 
the mean of all repeated administrations on 
the other. 


METHOD
Subjects 


From a freshman dormitory at the University of 
Oregon, 198 coeds volunteered to participate for pay 
in this study, and 157 completed all of the experi-
mental procedures. The mean age of the subject
group was 18.0 years, with a standard deviation of
.4 year.
Instruments


Opinion Questionnaire. The criterion of social con- 
formity was obtained by means of a double ad- 
ministration of the Opinion Questionnaire previously 
used by Hastorf and Piper (1951), Jackson (1964), 
and Loomis and Spilka (1963). The 45 questions 
included 15 dealing with economic policies (e.g.,
“During a period of shortage the government should
subsidize public housing.”), 15 covering educational
policies (e.g., “Teachers should be hired and assigned
to their positions by educational specialists rather
than by local school boards.”), and 15 covering
more general personal values (e.g., “There are cases
where pre-marital sexual relations are justified.”).
The subjects were asked to indicate the amount of
their agreement or disagreement with each state-
ment on a scale ranging from 1 (complete agree-
ment), through 5 (uncertainty), to 9 (complete
disagreement).

For the second administration of the Opinion 
Questionnaire, each of the 45 questions was fol-
lowed by a number described as the average response
given by the subjects on the first administration of
the Questionnaire. For the 25 questions with the 
smallest dispersions of group ratings on the first 
administration, the reported mean value was ob- 
tained by shifting the actual mean 3 points toward 
whichever end of the scale was most distant. The 
mean values of the 20 questions with the largest 
dispersions were reported accurately. This procedure 
was essentially the same as that used by Jackson
(1964) and Loomis and Spilka (1963). It is im- 
portant to realize that the accurate versus spurious 
distinction is confounded with differences in 
original item dispersions, a confounding also common 
to the previous studies. 
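
The construction of the reported means can be sketched as follows. This is a minimal illustration of the rule described above, not the author's actual computation. The function names are hypothetical, and the handling of a mean exactly at the scale midpoint (shifted upward here) is an assumption the text does not address.

```python
def items_given_spurious_means(item_sds, n_spurious=25):
    """Indices of the items with the smallest first-administration dispersions."""
    order = sorted(range(len(item_sds)), key=lambda i: item_sds[i])
    return set(order[:n_spurious])


def reported_mean(actual_mean, is_spurious, shift=3.0, low=1.0, high=9.0):
    """Mean shown to subjects at the second administration.

    Accurate items are reported as is. Spurious items are shifted 3 points
    toward whichever end of the 1-to-9 scale is farther from the actual mean.
    A mean exactly at the midpoint is shifted upward (ASSUMED tie rule).
    """
    if not is_spurious:
        return actual_mean
    if high - actual_mean >= actual_mean - low:
        return actual_mean + shift
    return actual_mean - shift


# Hypothetical item means: 3.2 is nearer the low end, so it moves up;
# 6.5 is nearer the high end, so it moves down.
shifted_up = reported_mean(3.2, is_spurious=True)
shifted_down = reported_mean(6.5, is_spurious=True)
```

Because the shift is always toward the more distant end, a 3-point displacement cannot push a reported mean off the 9-point scale. The selection step also makes the confounding noted above explicit: whether an item receives a spurious mean is determined entirely by its original dispersion.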

Two different sets of instructions were utilized
for the second administration of the Opinion Ques-
tionnaire. Approximately one-half of the subjects
were asked to try to remember the answers they
had given to the Questionnaire on its first administra-
tion and to give exactly the same response on the
second occasion. These memory instructions were
virtually identical to those used by previous in-
vestigators. The other half of the subjects were
asked to fill out the Questionnaire “as you feel today,
as if you had never filled out the Questionnaire be-
fore” (current instructions). Both the memory and
current groups filled out the Questionnaire using the
same 9-point rating scale used previously. A copy of
the Opinion Questionnaire and a table listing the
means and standard deviations of each item on
both administrations (as well as the means re-
ported to the subjects on the second administration)
has been deposited with the American Documentation
Institute.²


? This additional material may be obtained without 
charge from the authors at Oregon, Research In- 
stitute, P. O. Box 5173, Eugene, Oregon 97403, or 
from the American Documentation Institute. Order 
Document No. 8564 from ADI Auxiliary Publica- 
tions Project, Photoduplication Service, Library of 


Predictor Inventory. Jackson (1964) culled from 
two previous studies 45 true-false items which had 
successfully predicted social conformity. The first 22 
items were taken from Barron (1953). Of 84 items 
rationally appearing to measure attributes related 
to experimental conformity, these 22 actually dif- 
ferentiated “yielders” from “independents.” Barron 
described his procedure as follows: 


Most of these items were written anew, but others 
were culled from such sources as Murray’s “Ex- 
plorations in Personality,” the E, F, and PeC 
scales of the California Public Opinion Study, 
and scales developed at the Institute of Person- 
ality Assessment and Research to measure such 
variables as Originality and Personal Soundness 
[p. 293]. 


The last 23 items of the Predictor Inventory were 
those MMPI and CPI items which differentiated 
yielders from independents in a study by Crutch- 
field (1955). A copy of the Predictor Inventory has 
also Been deposited (see Footnote 2). 

In the present experiment the Predictor Inventory 
was administered with two different sets of in- 
structions. Approximately half the group received 
the following instructions: 


This is a test of your ability to understand 
other persons. It tests how well you can predict 
the attitudes and values of those who live around 
you, specifically University of Oregon freshman 
women. 

The inventory consists of 45 statements. Read 
each statement and decide whether others would 
consider a true or false answer characteristic of 
themselves. You are not asked whether the state- 
ment is true or false as applied to you; rather 
you are asked to decide which answer you think 
most other University of Oregon freshman women 
would endorse. 


These instructions will be referred to as prediction 
instructions. The second group of subjects was asked 
to respond to each item of the Predictor Inventory 
as it applied to themselves. These traditional in- 
structions will be referred to as endorsement instruc- 
tions, 


Procedure 


The subjects in this study were tested 1 night 
each week, for approximately 1 hour each evening, 
for 5 consecutive weeks. They completed the first 
administration of the Opinion Questionnaire during 
the first testing session and the second administra- 
tion during the fifth session. During each of the 
first four sessions each subject received either the 
endorsement or the prediction instructions for the 
Predictor Inventory. Half of the endorsement and 
half of the prediction group received the second 


Congress, Washington, D. C. 20540. Remit in advance 
$1.75 for microfilm or $2.50 for photocopies and 
make checks payable to: Chief, Photoduplication 
Service, Library of Congress. 
administration of the Opinion Questionnaire under
memory instructions; the other half of each group
received current instructions. Consequently, four
experimental groups can be compared: endorsement-memory
(N = 34), prediction-memory (N = 35),
endorsement-current (N = 36), and prediction-current
(N = 32).
RESULTS 
Criterion Indices 


From the double administration of the 
Opinion Questionnaire, two yielding indices 
were calculated for each subject. These cri- 
terion indices, described in relation to the 
means reported at Administration II, were: 
yielding (Y): The subject’s average distance 
from the mean at Administration I minus her 
average distance from the mean at Adminis- 
tration II; and yielding, corrected for distance 
from the mean (Y/D): Yielding, divided by 
the subject’s distance from the mean at Ad- 
ministration I. This latter statistic estimates 
the extent to which the subject yielded in 
relation to her opportunity to yield. An indi- 
vidual whose response on the first administra- 
tion fell at the mean obviously would be un- 
able to obtain a positive yielding score. On the 
other hand, an individual whose initial response
deviated widely from the mean response
might be expected to show some tendency
to regress towards the mean. This corrected
yielding statistic attempted to equate
subjects in terms of their opportunity to yield. 
In the terminology of Fisher and Lubin 
(1958), Y is a measure of movement, while 
Y/D is a measure of conformity. 
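
The two indices defined above can be sketched in a few lines. This is only an illustration under stated assumptions: "distance" is taken to be the absolute difference between a response and the reported mean, the function name `yielding_indices` is hypothetical, and the ratings are invented rather than taken from the study.

```python
from statistics import mean


def yielding_indices(first, second, reported_means):
    """Return (Y, Y/D) for one subject.

    first, second  -- her ratings at Administrations I and II
    reported_means -- the means reported to her at Administration II
    Distance is assumed to be the absolute difference from the mean.
    """
    d1 = mean(abs(r - m) for r, m in zip(first, reported_means))
    d2 = mean(abs(r - m) for r, m in zip(second, reported_means))
    y = d1 - d2  # movement toward the reported means
    # A subject already at the mean (D = 0) has no opportunity to yield.
    y_over_d = y / d1 if d1 else float("nan")
    return y, y_over_d


# Hypothetical ratings on a 9-point scale for three items.
y, y_over_d = yielding_indices([1, 9, 5], [2, 7, 5], [4, 5, 5])
```

Dividing by the initial distance is what turns the raw movement score into a proportion of the possible movement, which is the sense in which Y/D equates subjects for opportunity to yield.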

Each statistic was computed for the 25 
items for which spurious mean values were 
reported, for the 20 items for which correct 
means were reported, and for all 45 items. 
The means, standard deviations, reliability 
coefficients, and intercorrelations of the cri- 
terion indices are presented in Table 1 for 
both the memory and current groups. In each 
case the figures in Table 1 were obtained by 
averaging the results for the prediction and 
endorsement groups. The corrected split-half 
reliability coefficients for the combined groups 
are listed in the diagonals. The column and
row labeled r_D present the correlations between
the yielding measures and the initial
distance, D, of the subject from the reported
mean.
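
A corrected split-half reliability coefficient of the kind listed in the diagonals can be sketched as follows. The Spearman-Brown step-up is the one named in the note to Table 1; the odd-even split and all function names are assumptions made for illustration, since the text does not say how the criterion items were halved.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(scores):
    """scores: one list of item-level values per subject.

    Items are split into odd and even positions (an assumed split).
    """
    odd = [sum(s[0::2]) for s in scores]
    even = [sum(s[1::2]) for s in scores]
    return spearman_brown(pearson_r(odd, even))
```

For example, a correlation of .50 between the two halves steps up to a corrected estimate of about .67.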

As the diagonal entries in Table 1 indicate, 
the corrected split-half reliability estimates 
for the various yielding indices were substan- 
tial, though considerably less than unity. In 
general, Y/D had slightly higher reliability 
estimates than Y. The highest reliability co- 
efficients were for Y/D over all 45 items, and 
the lowest were for Y over the 20 items with 


TABLE 1 


MEANS, STANDARD DEVIATIONS, AND RELIABILITY COEFFICIENTS
FOR SIX CRITERION INDICES FROM THE OPINION QUESTIONNAIRE


M I B rA 
Group Y Y/D 
M [4 Tp 
Spurious | Correct Total Spurious | Correct Total 
Current 
Y * 
Spurious 398 55 .93 97 60 92 98 . 70 48* 
Correct A8 60" 82 57 96 79 .63 49 43 
ye 94 17 758 92 83 98 82 53 -19* 
Spurious 97 49 92 E .62 95 29 19 01 
Correct 53 96 18 56 .63* 84 27 20 —.05 
Total 91 ED 97 94 .80 .82^ 28 18 | —.02 
Memory 
M 93 2 83 27 30 28 
à 63 52 .52 49 20 18 
Tp 05 10 08 |—16 | —.03 —.09 


Note.—All correlations between indices are significantly greater than 0 (p <.01). 
a Mean split-half reliability coefficient corrected by Spearman-Brown formula.
* p < .05; no other r_D correlations are significant.

correct means. The reliability coefficients
based on the items given distorted means were
slightly higher than those based on items
given the correct means, but the reliability
coefficients appeared to be unaffected by differences
in instructions.

Because there were, by definition, more
individuals close to the correct means than
to the spurious means, the average amount
of yielding (Y) to items for which the correct
means were reported was, as expected,
less than that to the items for which spurious
means were listed. However, the Y/D
values, which take this distance into account,
were virtually identical for the two sets of
items. The mean values of both Y and Y/D
were similar for the current and memory
groups, thus corroborating Hastorf and
Piper’s (1951) finding that directions to try
to remember previous responses do not appreciably
affect the total amount of yielding.
The mean scores show that in all groups the
subjects did tend to move in the direction of
greater conformity with group opinion; in
fact, they yielded from .67 to .99 point on a
9-point scale, or, as shown by the Y/D
values, about 30% of the distance that they
could have yielded if they had changed all
of their answers to conform to the reported
means exactly.

As Table 1 indicates, the intercorrelations
among the yielding indices were all significantly
different from 0 (p < .01). The correlations
between the indices based upon
spurious means and those based upon correct
means were .55 and .48 for Y and .62 and
.56 for Y/D; when these coefficients were
corrected for attenuation (using the split-half
estimates in Table 1), they were all in the
.90s. Overall, the correlations between Y and
Y/D ranged from .94 to .98, the virtual identity
of these two indices most likely stemming
from the fact that the subjects did not differ
to any great extent in their initial distances
from the composite reported means. The r_D
correlations, ranging from -.05 to .19, also
reflect the lack of individual differences in
initial distance.
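
The correction for attenuation mentioned above divides an observed correlation by the geometric mean of the two reliabilities. The sketch below uses the standard formula; the function name and the reliability values are hypothetical, not read from Table 1.

```python
from math import sqrt


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the correlation between two measures as if both
    were perfectly reliable: r_xy / sqrt(r_xx * r_yy)."""
    return r_xy / sqrt(r_xx * r_yy)


# Hypothetical reliabilities of .60 and .62 for the two indices.
corrected = correct_for_attenuation(0.55, 0.60, 0.62)
```

With reliabilities in that range, an observed correlation in the .50s rises to roughly .90, which shows how moderate observed coefficients can imply near-identity of the underlying indices.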

In summary, then, this instrument did pick 
up a marked tendency toward group conform- 
ity, and it measured yielding with respectable 
reliability. There were no significant differences
in yielding as a function of either
memory as opposed to current instructions, or 
accurate as opposed to inaccurate information. 


Predictor Inventory 


For each subject, the following scales (all 
keyed for independence) were scored for each 
administration of the inventory: 


(All) All items from the Predictor Inventory (N = 45)

(B) All items from Barron (N = 22)

(C) All items from Crutchfield (N = 23)

(J) Jackson replication items (N = 11). These
items are those 11 items from Jackson (1964) which
differentiated yielders from independents when administered
under desirability rating procedures. They
can be compared with the following four sets of
11 items which differentiated similar groups under
the more traditional endorsement procedures.

(Ba) Barron half-scale a (N = 11)

(Bb) Barron half-scale b (N = 11)

(Ca) Crutchfield half-scale a (N = 11)

(Cb) Crutchfield half-scale b (N = 11)


Scores on the following eight subscales allowed 
analyses of the effects of item keying: 


(T) All true items (N = 14)

(Bt) Barron true items (N = 8)

(Ct) Crutchfield true items (N = 6)

(Jt) Jackson true items (N = 5)

(F) All false items (N = 31)

(Bf) Barron false items (N = 14)

(Cf) Crutchfield false items (N = 17)

(Jf) Jackson false items (N = 6)


The means, standard deviations, reliability 
coefficients, and intercorrelations among these 
scales for Administration I have been depos- 
ited (see Footnote 2). The average correla- 
tion between the Barron and Crutchfield 
scales was substantial (.53) for the endorse- 
ment groups, but nonsignificant for the pre- 
diction groups (.10). Most of the true and 
false scales were not interrelated for any of 
the four groups; at times, in fact, they cor- 
related negatively. In marked contrast, the 
odd-even split-half reliability coefficients
averaged .66 for the endorsement groups and 
.46 for the prediction groups. This wide range 
of coefficients for the three full-scale splits 
(Barron-Crutchfield, true-false, and odd- 
even) provides a vivid illustration of the 
effect item selection can have on split-half 
reliability estimates. 

The split-half reliability coefficients for B 
and C averaged .45 for all four groups. For
the endorsement groups J appeared to be as
reliable as B and C, but for the prediction 
groups the reliability of J was near 0. In gen- 
eral, split-half reliability estimates were 
higher for the endorsement groups than for 
the prediction groups. Moreover, the standard 
deviations of the scales were slightly lower 
for the prediction groups than for the endorse- 
ment groups, indicating some slight tendency 
for the subjects to share a common stereo- 
type. In addition, the mean values for most 
of the scales were slightly lower for the pre- 
diction groups than for the endorsement 
groups, suggesting that the average subject 
may have ascribed a bit more originality and 
independence to herself than to her peers. 
Since the Predictor Inventory was administered
four times to each group, separate
test-retest reliability estimates were calcu- 
lated for Sessions 1 versus 2 and for Sessions 
3 versus 4. The test-retest reliability coeffi- 
cients for Sessions 3 versus 4 were higher 
than those for Sessions 1 versus 2 for all 
scales, confirming Howard's hypothesis that 
individuals become more consistent over 
time. Of equal interest is the fact that the 
reliability coefficients for the endorsement 
groups were substantially higher than the 
corresponding coefficients for the prediction 


TABLE 2 
VALIDITY COEFFICIENTS FOR FIRST ADMINISTRATION


Endorsement groups Prediction groups 
Scale 

M Ci M 

wash rss) (N = 38) es 
All —.06 —.30* 03 
B —.28* — 39% .07 
€ AT —.08 —.03 
J — 346 — 13 —.16 
Ba —.20 —.30* AL 
Bb — 129 —.35* 01 
Ca .33* 18 —.07 
Cb —07 —37* .03 
È —J3t* —.16 —.04 
Bt —.31* —.26 —.05 
Ct —21 07 .00 
Jt — 46 —.03 —.21 

.08 —.25 05 
BE = 19 —.32* 12 
Cf .28* =li —.03 
Jt —41 —.15 —.06 
Note.—Criterion = Y, based on 25 items with spurious 
means.
* p < .05.


groups, appearing to indicate that subjects
were more consistent in their estimates of
their own attributes than they were in their
estimates of the attributes of others. For the 
total inventory (All), the test-retest correla- 
tions for the endorsement groups averaged 
.89 and .95, for Administrations 1 versus 2 
and 3 versus 4, respectively; for the prediction
groups the corresponding reliability coefficients
averaged .72 and .84.

To test Howard’s hypothesis that subjects 
move toward uniqueness over repeated test- 
ings, indices of interindividual similarity were 
computed for each administration of the in- 
ventory. Each subject’s responses were com- 
pared with those of each of the other sub- 
jects in her group, and the average number 
of common responses was calculated. These 
indices, in turn, were then averaged for each
group and for each administration of the in- 
ventory. The findings indicated that for this 
set of items the subjects responded in a 
fairly differential manner to begin with, and 
no evidence of increased interindividual variability
occurred as a result of repeated testings.
The average subject in the endorsement
groups gave only 57% of her responses in 
common with others in her group on both the 
first and fourth administrations; for the pre- 
diction groups the corresponding figures were 
61% and 60% for first and last administra- 
tions, respectively, again indicating some 
slight tendency for more stereotypy among
the prediction groups. 
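
The similarity index described above can be sketched as the average agreement over all pairs of subjects in a group. Averaging each subject's agreement with every other subject and then averaging over subjects gives the same figure, because every pair carries equal weight. The function name and the responses are hypothetical.

```python
from itertools import combinations


def mean_commonality(responses):
    """responses: one list of true/false answers per subject.

    Returns the mean proportion of items on which two subjects
    in the group gave the same answer.
    """
    n_items = len(responses[0])
    pair_props = [
        sum(a == b for a, b in zip(s1, s2)) / n_items
        for s1, s2 in combinations(responses, 2)
    ]
    return sum(pair_props) / len(pair_props)


# Three hypothetical subjects answering three items.
group = [[True, True, False], [True, False, False], [False, False, False]]
commonality = mean_commonality(group)
```

A value near .50 on true-false items means that subjects agree little more often than chance, which is the sense in which responses were "fairly differential" to begin with.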


Validity Coefficients as a Function of 
Different Response Modes


Table 2 presents, for the first administra- 
tion only, the correlations between the 16 
Predictor Inventory scales and the yielding 
criterion, Y, based on the 25 items with 
spurious means. Validity coefficients for Y/D, 
essentially the same as those for Y, have been 
deposited (see Footnote 2). Table 2 indicates 
that the inventory as a whole had little valid- 
ity, and that only for the prediction-memory 
group. (Note that, under the hypotheses of 
the study, valid coefficients are negative.) 
For the endorsement-memory group, which in 
the case of the total inventory might be considered
a direct cross-validation, the correlation
coefficients were nonsignificant. For this
group the Barron items carried all of the valid
weight, with validity coefficients around —.30, 
while the Crutchfield items had positive co- 
efficients; combining both sets of items de- 
stroyed the predictive validity of the inven- 
tory as a whole. For the prediction-memory 
group the Barron items were even more valid
(-.39), and while the Crutchfield items were
not significantly related to the criterion, they
were in the predicted direction; consequently,
the validity coefficients for the entire inventory
were significant for this group. More of
the significant validity coefficients were found
among the two memory groups than among
the two current groups.


Previous investigators (Jackson, 1964;
Loomis & Spilka, 1963) have used the same
items, keyed in the same direction, to compare
the endorsement and desirability rating
procedures. However, since the items in the
Predictor Inventory were initially selected
because of their predictive validity under endorsement
conditions, there is every reason to
suspect that such an analysis would favor the
endorsement groups. To test adequately the
major hypothesis of this study it is necessary
to cross-validate two sets of items, each set
of which previously discriminated yielders
from independents when it was originally administered
under its own set of instructions.
Fortunately, Jackson has reported 11 items
which differentiated at the .05 level or better
yielders from independents under desirability
rating instructions. For this set of 11 items
(J) the present study provides a cross-validation.
In order to provide comparison sets of
items which originally differentiated under
endorsement procedures, the Crutchfield and
Barron items were subdivided into two odd-even
half-scales, each composed of 11 items.
Consequently, it was possible to compare the
cross-validative results for the Jackson items
under prediction conditions with the results
for the four half-scales from Barron and
Crutchfield under endorsement conditions.
These data are also presented in Table 2.
The Jackson items, which originally differentiated
under instructions similar to those
given the prediction groups, not only failed
to differentiate under these conditions on
cross-validation, but were even more valid
than the Barron items under endorsement
procedures! In the cross-validation of the
Jackson items afforded by the prediction-memory
group, three of the four subscales
from Barron and Crutchfield performed more
validly. These results are the exact opposite
of those expected.

The true subscales appeared to carry most 
of the valid weight under endorsement in- 
structions, a finding congruent with that pre- 
viously reported by Jackson. The false items, 
under endorsement instructions, appeared 
worthless. Under prediction instructions, on 
the other hand, both true and false subscales 
appeared about equally valid. It should be 
borne in mind that, since independence-conformity
as measured by this inventory is a
bipolar trait, the keying direction is arbitrary.
Therefore, if the scale were keyed for conformity,
it would be false items that were
valid and true items that were invalid. The
validity coefficients for each item for the first
and fourth administrations of the Predictor
Inventory have been deposited (see Footnote 2).


Effects of Repeated Testings 


Table 3 presents the predictor-criterion 
correlations for four administrations of the 
Predictor Inventory. The results, reported 
only for the correlations with Y based on the 
25 items with spurious means, show that, in 
general, repeated testings did not increase the 
validity coefficients. In fact, for the prediction- 
memory group, the originally significant 
validity coefficients vanished as a function of 
repeated testings. Validity coefficients for 
Y/D, essentially the same as those for Y, 
have been deposited (see Footnote 2). 

In the last column for each group in Table 
3 the validity coefficients for the mean scores 
over four administrations of the Predictor In- 
ventory are listed. When these coefficients 
are compared with those from single adminis- 
trations, the differences are slight. Since the 
hypothesis that the validity coefficients would 
tend to increase with repeated administra- 
tions was not confirmed, there would seem to 
be no reason for rejecting the mean score—or 
for that matter, the first score—as the best 
predictor. 
TABLE 3
COMPARISON OF VALIDITY COEFFICIENTS OVER REPEATED TESTINGS 
Endorsement groups Prediction groups 
Scale Administration Administration 
1 2 3 4 M 1 ERRA lees ; 4 »M 
Memory groups
All P! —os | —26 | —19 | —20 "o5 | —.10 04 | —.08 
B —.28* | —33* | —.29* | —.31* 19 2.07 .05 —.05 
Cc A7 | —.07 -00 —.07 —.10 —.09 00 —07 
J —.34* | —.35* | —.35* | —.33* .09 07 404 03 
bx —.31* | —.42** | —.43** | — 49** —.08 -.26 —.09 —.17 
F 08 | —.11 —.03 —04 08 01 08 EU! 
Current groups 
All —47 |—20 | —.22 | —.19 08 4 —.03 01 02 
B —.28* | —.30* | —.34* | —.32* 07 09 10 09 
Ge —406 | —.07 —.05 —.06 .05 —45 —.09 —.07 
J —49 | —.16 — AT —A7 .05 ;-.08 —.04 —.06 
Ai 7 | —.04 —.03 —.01 —.03 —.19 —.04 —.09 
F —.22 |—232 | —.25 | -23 40 04 .03 .06 


Note.—Criterion = Y, based on 25 items with spurious means. 
* p < .05.
** p < .01.
Discussion 


The findings from this study show con- 
clusively that knowledge of the responses of 
others can have a substantial effect upon 
subjects’ responses, and that this effect oc- 
curs whether the information presented is 
accurate or inaccurate. This finding rep- 
licates similar findings by Hastorf and Piper 
(1951), Asch (1952), Barron (1953), Crutch- 
field (1955), Tuddenham (1958b), Jackson 
(1964), and Loomis and Spilka (1963). 

TABLE 4
VALIDITY COEFFICIENTS FROM THREE STUDIES

                                             Instructions
Study                                  Endorsement   Prediction a
Jackson (1964) b                          -.22*         -.29**
Loomis and Spilka (1963)                  -.26*          .15
Goldberg and Rorer (present study)        -.06          -.30*

Note. For the total 45-item Predictor Inventory (All)
against a criterion of yielding (Y) for the 25 Opinion Questionnaire
items for which spurious means were reported, with
memory instructions.
a Social desirability rating instructions: Jackson (1964) and
Loomis and Spilka (1963).
b Shifts away from reported means given value of 0 in calculating Y.
* p < .05.
** p < .01.


The measures of yielding used in this study
appear to be of sufficient reliability to recommend
them for further experimental work.
In particular, the relationship between yield- 
ing on the Opinion Questionnaire and sub- 
missive, acquiescent, or conforming behavior 
in other situations is certainly open to ques- 
tion, and some exploration of this relation- 
ship is clearly needed. 

A comparison of the present study with 
previous ones reveals some puzzling differ- 
ences. Table 4 presents the validity coeffi- 
cients from the studies by Jackson (1964) 
and Loomis and Spilka (1963), along with 
those obtained from the present study when 
the entire 45-item Predictor Inventory (All) 
was correlated with yielding (Y) (based on 
the 25 questionnaire items with spurious 
means, administered under memory instruc- 
tions). There were some procedural differ- 
ences among the three studies. Both previous 
investigators obtained 9-point desirability 
ratings on the Predictor Inventory, rather
than predictions of the endorsements of others
as in the present study. Moreover, Jackson
gave 0 value to shifts away from the reported
means in computing Y. And finally, Jackson
used a mixed group of male and female subjects,
while the present study included only
females (Loomis and Spilka did not report
the percentage of males and females in their
study). However, these differences do not appear
to explain the pattern of correlations
reported in Table 4. 

When these results are considered in conjunction
with the paradoxical results obtained
when the endorsement and the prediction
scales were cross-validated within the present
study, it seems safe to conclude that, to date,
attempts to predict conformity behavior have
been far from satisfactory. Furthermore, the
disappointing validity results make it impossible
to make any conclusive statements about
the relative validity of either endorsement as
opposed to prediction instructions, or single
as opposed to repeated administrations of a
test. In this latter regard, it should be noted
that Howard’s hypothesis of increased intraindividual
consistency over repeated administrations
was confirmed; however, Howard’s
hypothesis of increased interindividual variability
was not confirmed, probably because of
the initial low level of commonality for this
set of items. The derivative hypothesis that
validity should increase over repeated administrations
may have been unsupported
solely because the instrument had virtually
no initial validity.
AFFECT AND EXPECTATION 1
4 groups of Ss were exposed to a probability learning situation in which they
guessed which of 2 stimuli would next appear. One set of stimuli contained
angry and smiling faces, while the other or neutral set contained big and
little kangaroos. When the input ratio was 70% angry faces to 30% smiling
faces, Ss markedly underestimated the dominant input. When the ratio was
reversed (70% smiling, 30% angry), expectancies for the dominant stimulus
approximated objective input. Thus, relative preferences for the stimuli appeared
to dictate expectations. Expectancy curves for the relatively neutral
kangaroos fell between the curves for the affective stimuli. Considerable
interindividual variability was found for affective expectancies. These individual
differences were tentatively associated with personality differences.


Brunswik's (1939, 1943) suggestion that 
research on learning should approximate 
everyday learning situations and Humphreys’ 
(1939) classic study on conditions of uncer- 
tain outcome have been succeeded by a large 
number of studies of probabilistic events. 
Striking regularities have been demonstrated 
for acquisition, asymptotic response level, and 
the degree to which probability response 
levels can be manipulated by instructions, 
extraneous rewards, previous reinforcement 
frequencies, stimulus asymmetries, and con- 
tingent probabilities (e.g., Brackbill, Kappy, 
& Starr, 1962; Dorwart, Ezerman, Lewis, & 
Rosenhan, 1965; Gerjuoy, Gerjuoy, & Mathias,
1964; Goodnow, 1955; Grant, Hake, &
Hornseth, 1951; Hake & Hyman, 1953; Ir- 
win, 1953, 1960; Marks, 1951; Messick & 
Solley, 1957; Siegel & Goldstein, 1959; 
Stevenson & Weir, 1959; Stevenson & Zigler, 
1958; Weir, 1962). 

Almost all of these studies have employed 
affectively neutral or nearly neutral stimuli in 
the choice situations—colored lights, blank 
and not-blank cards, etc. Such stimuli pre- 
sumably permit greater control of the experi- 
mental situation in that they are less laden 
with unknown and private associations than 
are nonneutral stimuli. At the same time, such 


1 This research was supported in part by USPHS
Grants MH-07690-01 and M-4186. We are grateful 
to Henrietta Gallagher for performing the statistical 
computations and to Herbert Gerjuoy and Nathan 
Kogan for reviewing the manuscript. An early ver- 
sion of this paper was read at the 1963 meetings of 
the American Psychological Association. 


stimuli have helped to cast the expectancy 
area into a distinctly cognitive mold. The 
probabilities to be learned and the response
behavior that is demanded characterize an 
experimental situation to which the terms 
strategy, decision making, and rational game 
theory have been applied. Influences upon 
expectancy of an emotional or motivational 
nature are usually not considered in this experimental
paradigm.

What would happen in the learning of ex- 
pectancy if the stimuli were affective in char- 
acter? If, rather than red lights, one employed 
smiling and angry faces? The significance of 
the question derives in part from the fact 
that in “the natural environment,” as Bruns- 
wik called it, such emotional expectancies 
exist and must be learned. In part also, its 
importance resides in the interaction of ex- 
pectancy, need, and reward and in the ne- 
cessity to clarify the possible influence of 
motivation in originating and shaping certain 
expectancies and thereby, perhaps, certain 
consequent perceptions and behaviors (Mur- 
phy, 1947, 1956; Solley & Murphy, 1960). In 
affective probability learning some reinforce- 
ment value may be embedded in the meaning 
of the stimulus itself and may augment or 
reduce reinforcements imposed in the experi- 
mental procedure. 

What might be expected in affective proba- 
bility learning? First of all, we might antici- 
pate that initial response rates would vary 
depending upon the affective valence of the 
stimulus. Thus, while in the standard probability
learning experiment initial response
rates hover about .5, in affective probability 
learning initial rates would be above or below 
.5 depending in part upon the positive or
negative affective value of the dominant stimulus
to the individual. Such findings were,
indeed, obtained by Messick and Solley
(1957) in their exploratory studies of probability
learning in children. Initial preferences
for stimuli (happy and sad faces) were so
strong among some subjects that their response
rates began at or close to 1.00 and
underwent extinction during the subsequent
trials. Solley and Messick (1957) also noted a
tendency for adults to overguess the frequency
of occurrence of “happy” as opposed
to “sad” in the learning of probability relations
among combinations of attributive characteristics.

Moreover, we might also expect that the
learning curves and the asymptotic response
levels would vary according to the affective
valence of the dominant input. Where the
dominant input has a negative valence, for
example, the acquisition curve would be
flatter and the asymptotic level lower than for


FIG. 1. Experimental and control stimuli used in
this experiment.

[Figure 2 plot omitted. X-axis: blocks of trials (Block 1, 10 trials; all other blocks, 20 trials).]

FIG. 2. Performance of subjects in the BL, SA, and
AS conditions.


positive valued inputs. That the positive or 
negative meaning of the stimulus can significantly
affect response levels (at least over 30
trials) has been demonstrated by Solley, 
Jackson, and Messick (1957); when various 
symbols and manipulations reflecting positive 
or negative evaluative meaning were associ- 
ated experimentally with previously neutral 
stimuli, subjects significantly overguessed the 
occurrence of stimuli associated with positive 
meaning and underguessed those associated 
with negative meaning. 

Finally we would expect that individual 
differences in expectancy for positive and 
negative affect would be related to stable per- 
sonality characteristics. Consequently, in ad- 
dition to significant constant effects indicating 
that, on the average, stimuli with positive 
value are expected to occur overly frequently, 
we will also investigate the possibility that 
certain personality dispositions may mediate 
the influence of affective value on expectan- 
cies (cf. Klein & Schlesinger, 1949). 


METHOD 
Materials 


Two sets of stimuli were employed: the experi- 
mental stimuli were angry and smiling male faces 
and the relatively neutral stimuli were big and little 
kangaroos. The faces and animals were simple line 
drawings which were multilithed onto 5 × 8 inch
cards, scale photographs of which are shown in 
Figure 1. The experimental deck consisted of 150 
cards with a 70-30 input randomized in blocks of 
10. Two decks were constructed for each set of 
stimuli. One deck consisted of 70% smiling faces (or
big kangaroos for the neutral series) and 30% angry
faces (or little kangaroos), and is called the SA (or
BL) condition. The second deck consisted of 70%
angry faces (little kangaroos) and 30% smiling
faces (big kangaroos), and is referred to as the AS
(or LB) condition.
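
The construction of such a deck can be sketched as follows. The sketch assumes that "randomized in blocks of 10" means every block of 10 cards holds exactly 7 dominant and 3 nondominant cards in shuffled order; the function name, the card labels, and the seed are hypothetical.

```python
import random


def build_deck(dominant, other, n_cards=150, block=10, p_dominant=0.7, seed=None):
    """Build a deck with a fixed dominant/other ratio inside every block."""
    rng = random.Random(seed)
    n_dom = round(block * p_dominant)  # 7 dominant cards per block of 10
    deck = []
    for _ in range(n_cards // block):
        cards = [dominant] * n_dom + [other] * (block - n_dom)
        rng.shuffle(cards)  # random order within the block only
        deck.extend(cards)
    return deck


# SA condition: 70% smiling (S), 30% angry (A).
sa_deck = build_deck("S", "A", seed=0)
```

Blocking keeps the local input ratio at 70-30 throughout the series, so a subject's guesses in any stretch of trials can be compared against the same objective probability.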


Procedure 


The experiment was carried out with groups of 
8-13 subjects. Each subject was first given three 
booklets containing 50 pages each. The experimenter 
then displayed the stimulus cards and said (for 70% 
angry, 30% smiling conditions) : 


Here are two faces. One is angry [show] and 
the other is smiling [show]. Now, I have a deck 
of such cards [show] and I want you to guess 
which face is coming up. When I say "guess," if 
you think the face will be smiling, write an S on 
the first page of your booklet. If you think the 
face will be angry, write an A. Then turn the 
page and wait until I say “guess,” and enter your 
next guess on the following page. Use a separate 
page for each guess, but don't guess until I tell you
to. All right . . . Guess . . . Here [shows card]. 


Instructions were appropriately altered for the 
neutral conditions where subjects wrote B for big 
and L for little kangaroo. After each guess, the experimenter
showed the card. An intertrial interval of
about 3 seconds effectively prevented the subjects 
from looking back over their responses. 


Subjects 


Subjects were 116 undergraduate males and fe- 
males, who were distributed among the various con- 
ditions as shown in Table 1. 


RESULTS AND DISCUSSION 


Figure 2 presents the mean proportion of
responses to the stimulus with the 70% input
(S.70) for the AS, SA, and BL conditions.
(Data from the LB condition are not included
since they did not differ significantly from
BL, on the one hand, and were obtained from
a small N (N = 9) on the other.) A comparison
of the six groups, using the Kruskal-Wallis
test (Siegel, 1956) applied to individual
subjects’ mean proportions of responses
to the dominant input for the first 10, last 40,
and entire 150 trials, indicated that these
values were not drawn from the same population
(p < .01).


Initial Response Levels 


As Table 1 shows, initial response rates 
varied from the typically obtained .5 level, 
even for BL, the “control” condition. (Note
that response rate always refers to responses 
to the more frequent stimulus.) Thus, it 
would appear that the apparently neutral BL 
condition was not as neutral as we might have 
preferred: big and little kangaroos appear to 
have semantic associations that make subjects’ 
expectations for these stimuli depart from the 
ordinarily expected .5. Strictly speaking, then, 


TABLE 1 


COMPOSITION OF THE SAMPLES: MEANS (PERCENTAGE) AND STANDARD DEVIATIONS 
OF CHOICES OF DOMINANT INPUTS 


First 10 trials 


Last 40 trials Total 150 trials 


Composition and sex N 
of subjects Deviation 
M SD from .50 SD 

AS: 70% angry; 
30% smiling 

M 22 44.09 | 19.46 10.40 

F 15 40.00 | 15.06 8.80 

43 9.86 


M+F 
SA: 70% smiling; 
30% angry 
TABLE 2 


COMPARISON OF INITIAL RESPONSE RATE, ASYMPTOTIC PERFORMANCE, 
AND TOTAL PERFORMANCE FOR AS, SA, AND BL CONDITIONS 
(Mann-Whitney U Tests, Two-Tailed) 


                           Trials 1-10     Trials 111-150   Trials 1-150 
Comparison      N1    N2   z      p <      z      p <       z      p < 
AS versus SA    37    42   3.31   .0009    2.72   .007      4.48   .00006 
BL versus SA    28    40   4.23   .0001    1.56   .2        2.00   .05 
BL versus AS    28    37   1.38   .46      1.12   .27       2.22   .03 


the BL data serve as comparison, rather than 
as control data. 

For the SA condition, there was a signifi- 
cant initial departure from .5 in the direction 
of guessing more smiling than angry faces. 
One might argue that for this condition there 
was some learning of the input probabilities 
even in the first 10 trials. Such an argument 
would not hold, however, for the AS condi- 
tion where the initial expectancy was signifi- 
cantly below .5, despite the fact that angry 
faces constituted 70% of the input. A similar 
result was obtained for the BL condition. In 
these conditions the aversiveness of the S.70 
input (or perhaps the attractiveness of the 
S.30 input) overcame both the actual dominance 
of the S.70 stimulus in the deck, and 
ordinary .5 initial response tendencies. The 
differences in initial response tendencies are 
clearly seen in Table 2 where comparison of 
expectancies between SA versus AS and SA 
versus BL yields two-tailed p values of less 
than .0009 and .0001, respectively. No significant 
differences emerged between the AS and 
BL conditions. 
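
For readers who wish to relate the z values above to two-tailed probabilities, a minimal sketch follows. It assumes the large-sample normal approximation to the Mann-Whitney U statistic; the function name `two_tailed_p` is hypothetical, and the z values are those quoted for Trials 1-10.

```python
import math


def two_tailed_p(z):
    """Two-tailed probability of a standard normal deviate at least as extreme as z."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal.
    return math.erfc(abs(z) / math.sqrt(2.0))


# z values for Trials 1-10 as reported for the three comparisons.
reported_z = {"AS versus SA": 3.31, "BL versus SA": 4.23, "BL versus AS": 1.38}
p_values = {name: two_tailed_p(z) for name, z in reported_z.items()}
```

Small discrepancies from the tabled bounds are to be expected, since the published values are rounded upper limits rather than exact probabilities.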


Asymptotic Response Rates 


As Figure 1 indicates, there was a strong 
tendency for the differences between the three 
conditions to diminish and stabilize towards 
the end of the 150 trials. Nevertheless, mark- 
edly significant differences continued to ap- 
pear between the AS and SA conditions (Ta- 
ble 2), though neither of these conditions 
differed significantly from the BL condition 
at asymptote. The data would suggest that 
over time the subjects gradually learned to 
respond somewhat more to the objective stimulus 
inputs and to become relatively less affected 
by the affective value of the stimuli. 


Expectancy across 150 Trials 


Mean proportions of S.70 responses were, as 
might be expected, different for the AS and 
SA conditions over all 150 trials (Tables 1 
and 2). Where the dominant input was an 
angry face, that stimulus was expected 58% 
of the time. On the other hand, where the 
70% stimulus was a smiling face, the sub- 
jects anticipated the stimulus 68% of the 
time. Indeed, as Figure 2 shows, the mean 
curves for the experimental conditions did 
not overlap. 

Differences between responses to the affec- 
tive stimuli and to the relatively neutral ones 
were also significant across the 150 trials 
(Table 2). 


Sex Differences 


We examined the data to see whether there 
were sex differences in the tendency to over- 
or underexpect the S.70 input. Male-female 
comparisons within each condition revealed 
no statistically significant differences (Table 
3). When, however, comparisons were made 
across conditions it was found that mean differences 
in expectancy of the S.70 input were 
far greater for females than for males (Tables 
1 and 3). Thus, over the entire 150 
trials mean expectancy for S.70 for females in 
the AS condition was 56% while expectancy 
in the SA condition was 70.4% (Table 1). 
The difference was significant beyond .0001 
(Table 3). For men, such overall significant 
differences were present but not as sharp. At 
the asymptote, women tended to overexpect 
the S.70 input when it had a positive affective 
valence and to underestimate it markedly 
when the S.70 valence was negative. For men, 
continuous exposure to the stimuli decreased 
TABLE 3 


SEX DIFFERENCES IN EXPECTANCY FOR AS, SA, AND BL CONDITIONS 
(Mann-Whitney U Tests, Two-Tailed) 


                                     Trials 1-10     Trials 111-150   Trials 1-150 
Comparison                N1    N2   z      p <      z       p <      z             p < 
Within AS: F versus M     15    22   1.17   ns       1.04    ns       [illegible]   ns 
Within SA: F versus M     20    22   .399   ns       -1.04   ns       1.53          .13 
AS versus SA (M)          22    22   2.15   .03      .94     ns       2.44          .01 
AS versus SA (F)          15    20   2.67   .008     2.98    .003     3.92          .0001 


the impact of the affect to the point that, at 
asymptote, no significant differences in ex- 
pectancy between the SA and AS conditions 
were evident (Table 3). The data suggest 
that the affective stimuli were, to begin with, 
not quite as influential for men as they were 
for women, and that after 110 trials men were 
more able than women to resist the influence 
of the affects per se, and to respond more to 
the objective inputs. 


Individual Differences in Expectancy and 
Their Relation to Personality 

The apparent regularity of probability 
learning curves (and most other average per- 


TABLE 4 


CORRELATIONS BETWEEN TOTAL RESPONSES FOR THE 
S.70 INPUT AND PERSONALITY CHARACTERISTICS 


S.70 input         AS (70% angry)               SA (70% smiling) 
                   Females       Males          Females        Males 
EPPS scale         (N = 15)      (N = 19)       (N = 20)       (N = 20) 
Achievement        -.06          .19            .41            -.51* 
Deference          .62**         .00            -.24           -.31 
Order              .40           .38*           [illegible]    -.04 
Exhibition         -.38          -.41*          -.14           .39* 
Autonomy           -.23          -.15           .03            .34 
Affiliation        .05           -.02           -.33           .10 
Intraception       .34           .09            -.19           -.20 
Succorance         -.25          -.26           .14            .46* 
Dominance          -.01          -.16           [illegible]    -.27 
Abasement          -.16          .21            .00            .21 
Nurturance         [illegible]   [illegible]    -.04           [illegible] 
Change             -.48*         [illegible]    -.13           .02 
Endurance          .29           [illegible]    .25            -.31 
Heterosexuality    -.13          -.20           [illegible]*   .02 
Aggression         -.23          -.01           -.14           .03 


Note.—Ns vary from condition to condition. In addition, 
not all subjects took the EPPS. Thus, the Ns here are lower 
than those cited in Table 1. 

*p <.05. 

**p <.01. 


formance curves) are misleading in that these 
mean performance curves summarize and 
conceal large individual differences.* These individual 
differences lead us to inquire about 
possible relations between affective expectancy 
and personality. 

Most of the subjects had taken the Ed- 
wards Personal Preference Schedule (EPPS) 
5 weeks before the probability learning ex- 
periment. Correlations between the 15 need 
scales of the EPPS and total responses for 
the S.70 stimulus over all trials are presented 
in Table 4, separately for males and females 
and for the AS and SA conditions. Before 
proceeding to discuss these relations, however, 
we must issue a caveat: In addition to the 
problems of interpreting correlations for 15 
measures based upon such small sample sizes, 
the EPPS imposes some structural constraints 
upon the sizes of the correlation coefficients 
by virtue of its ipsative format. One of the 
properties of such an ipsative test is that the 
sum of the covariances between these 15 
scales and some other “criterion” score is 0, 
and when the ipsative scale variances are 
constant, the sum of the “criterion” correlations 
is also 0 (Radcliffe, 1963). Thus, these 
ipsative covariances may be considered to be 
covariances for the needs measured in a non- 
ipsative or normative way, with the average 
normative covariance subtracted. Since this 
average normative covariance is not known, 

*Figures 3a and 3b containing more detailed information 
have been deposited with the American 
Documentation Institute. Order Document No. 8606 
from ADI Auxiliary Publications Project, Photo- 
duplication Service, Library of Congress, Washing- 
ton, D. C. 20540. Remit in advance $1.25 for micro- 
film or $1.25 for photocopies and make checks pay- 
able to: Chief, Photoduplication Service, Library of 
Congress. 
it is difficult to compensate in the interpretation 
of the correlations for the restriction that 
the sum of the ipsative covariances is 0. 
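
The ipsative constraint described above can be checked numerically. The sketch below is an illustration under simplifying assumptions: the data are simulated, not EPPS scores, and ipsatization is modeled as subtracting each subject's own mean across the 15 scales rather than by the forced-choice format of the actual test. All function and variable names are hypothetical.

```python
import random


def covariance(x, y):
    """Sample covariance of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def ipsatize(rows):
    """Subtract each subject's own mean so that every row sums to zero."""
    return [[v - sum(row) / len(row) for v in row] for row in rows]


def criterion_covariances(rows, criterion):
    """Covariance of each scale (column) with the criterion score."""
    n_scales = len(rows[0])
    return [covariance([row[j] for row in rows], criterion) for j in range(n_scales)]


rng = random.Random(0)  # fixed seed; simulated scores for illustration only
N_SUBJECTS, N_SCALES = 20, 15
normative = [[rng.gauss(0, 1) for _ in range(N_SCALES)] for _ in range(N_SUBJECTS)]
criterion = [rng.gauss(0, 1) for _ in range(N_SUBJECTS)]

normative_cov = criterion_covariances(normative, criterion)
ipsative_cov = criterion_covariances(ipsatize(normative), criterion)
mean_normative_cov = sum(normative_cov) / N_SCALES
```

In this simulation the ipsative covariances sum to zero (up to rounding error), and each one equals the corresponding normative covariance minus `mean_normative_cov`, which is the relation stated in the text.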

With this warning in mind, we will proceed 
to discuss some of the high correlations in a 
deliberately tentative fashion. There is a modest 
consistency across sex for some of these 
correlations, but only in the AS condition. 
We note, for example, that the number of 
angry responses in the AS condition correlates 
negatively with Exhibition (the need to 
occupy center stage, as it were) and positively 
with Order. The negative relationship with 
Change for females, and the positive relationship 
with Endurance for males, would 
appear to be consistent with previously reported 
correlations for Order (Edwards, 
1954). The positive correlation between total 
expectancy for angry affect and deference 
among females, but not among males, would 
appear to be a stimulus-linked relationship: 
Since the stimuli were male faces, this finding 
might suggest that women who expect males 
to be angry are also deferent to them. 

In the SA condition, we find positive cor- 
relations between the number of smiling faces 
guessed and both Exhibition and Succorance 
among males. Among females, there is a 
strong positive correlation with interest in the 
opposite sex (Heterosexuality). (Recall again 
that the stimulus face was male.) Finally, if 
we take a high expectancy of smiling faces to 
mean a desire for relatively immediate social 
reinforcement from the environment, and the 
need for achievement as reflecting a pref- 
erence for internal (as opposed to social and 
external) reinforcement, the negative corre- 
lation for males between the EPPS Achieve- 
ment scale and expectancy for smiling faces 
can be rationalized. 

CONCLUSIONS 


We turn again to the differences between 
the AS and SA conditions. At the moment, it 
is clear that no definite statement can be 
made as to why these differences occur. Does 
the higher S.70 response level in the SA condition 
occur because subjects prefer the smiling 
face, or because they avoid the angry? Is it 
the positive valence of the smile that dictates 
the elevated response level, or the negative 


valence of anger, or the contrast of the two? 
The available data do not permit us to decide. 
At a descriptive level, however, we can say 
that given a context in which a smiling face 
is the more probable occurrence, subjects’ 
guesses will tend to approximate input. On 
the other hand, where the context is such that 
anger is the more probable occurrence, sub- 
jects will tend to underguess the dominant 
input markedly. 

These findings suggest that, with regard to 
emotional expectancies, subjects’ responses are 
somewhat autistic, that is, their behavior 
appears to be influenced by internal determi- 
nants such as affects and motivations as well 
as by external determinants such as input 
probabilities (Helson, 1953; Murphy, 1947; 
Solley & Murphy, 1960). Although such mo- 
tivationally biased expectancies may not over- 
ride immediate stimulus conditions in many 
perceptual tasks (Solley & Long, 1958), they 
may contribute to the context of perception 
by influencing the hypothesis or “best bet” 
used to organize the stimulus information 
(Bruner, 1957; Ittelson & Cantril, 1954; 
Postman, 1951), and in extreme cases may 
set the stage for autistic perception (Murphy, 
1947; Solley & Murphy, 1960). 
EFFECTS OF DISCREPANCIES BETWEEN OBSERVED 
AND IMPOSED REWARD CRITERIA ON THEIR 
ACQUISITION AND TRANSMISSION * 


WALTER MISCHEL AND ROBERT M. LIEBERT 
Stanford University 


An adult (M) alternated turns with child Ss in a bowling game with experimentally 
controlled scores and abundantly available rewards. The treatments 
involved discrepancies between the performance criteria used by M to reward 
himself and those he imposed on S. Thereafter, Ss continued the game in M’s 
absence, with free access to rewards. To examine “role-taking” effects, 1/2 the Ss 
in each treatment performed alone 1st and then demonstrated the game to 
another younger child (O), with the sequence reversed for the remainder. As 
anticipated, reward schedules in the adult's absence were most stringent when 
both M and S had initially adhered to a high criterion and least when S had 
been permitted to reward himself for low achievements. Ss who were trained 
to reward themselves only on a stringent criterion and observed M reward 
himself similarly, maintained more stringent schedules than those who had 
been given the same stringent direct training for self-reward but by an M who 
rewarded himself leniently. The criteria Ss imposed on O tended to be identical 
with those they imposed on themselves and role taking had only indirect effects. 


A critical aspect of self-control is the individual’s 
own self-administration and regulation 
of the rewards and punishments which 
are available to him without external con- 
straints. Humans evaluate their own perform- 
ance and frequently set standards which 
determine, in part, the conditions under which 
they self-administer or withhold numerous 
readily available gratifications and a multitude 
of self-punishments. Failure to meet 
widely varying self-imposed performance 
standards often results in self-denial or even 
harsher self-punishments whereas attainment 
of difficult criteria more typically leads to 
liberal self-reward and a variety of self-congratulatory 
responses. Although research 
concentrating on infrahumans may find it 
easy to neglect this phenomenon, it is ap- 
parent that for humans self-administered re- 
inforcers constitute powerful incentives for 
learning and potent reinforcers for the main- 
tenance of behavior patterns. In spite of the 
importance of self-reward as a human process 
there have been relatively few experimental 
investigations of its antecedents. 

Kanfer and Marston (1963b) provide some 
support for the direct conditioning of “self-reinforcing 
responses” and found that adults 
who were encouraged for judging their responses 
as accurate on an ambiguous noncontingent 
task increased their rate of self-reinforcement 
and rewarded themselves more 
frequently on a new learning task than those 
who were discouraged from judging their re- 
sponses as accurate. The same authors also 
found that the frequency of self-reinforcement 
is partly dependent on such variables as the 
correctness of the individual's responses and 
the degree of similarity between the training 
and generalization tasks (Kanfer, Bradley, & 
Marston, 1962; Kanfer & Marston, 1963a). 

An effective means of influencing children 
to adopt particular self-reward criteria con- 
sists of exposing them to the criteria ex- 
hibited by models. It has been demonstrated 
that mere observation of a model's self- 
reward patterns, without direct reinforcement 
to the observer, can result in their adoption 
by the observer even in the model's absence 
(Bandura & Kupers, 1964). 

In life situations reward standards usually 
are transmitted by individuals who exhibit 
their own self-reward criteria and also reinforce 
the observer's adherence to particular 
criteria. The modeled and directly reinforced 
behaviors may not be congruent and the 
criteria used by social agents for adminis- 
tering rewards to themselves often are dis- 
crepant with the standards which they 
directly impose on others. Consider, for ex- 
ample, the father who tries to influence his 
child towards self-denial and work while he 
simultaneously and persistently indulges him- 
self. Although frequent reference is made to 
the importance of "consistency" in child- 
rearing practices, usually this refers to con- 
sistency in the use of direct training tech- 
niques across different situations and the 
effects of consistency or discrepancy between 
direct training and modeling procedures remain 
unexplored. The present study therefore 
investigated the effects of discrepancies in the 
stringency of the self-reward criteria used by 
an adult and the standards he imposed on 
a child. 

Children participated with a female adult 
model in a task which seemingly required skill 
but on which scores were experimentally con- 
trolled. A plentiful supply of tokens which 
could be exchanged for rewards was available 
to both the model and the subject. In one 
experimental group the model rewarded her- 
self only for high performances but guided 
the subject to reward himself for lower 
achievements; in a second condition the 
model rewarded herself for low performances 
but led the subject to reward himself only 
for higher achievements; in the third group 
the model rewarded herself only for high per- 
formances and guided the child to reward 
himself only for equally high achievements. 
After exposure to these experimental pro- 
cedures measures were obtained of the chil- 
dren's self-reward patterns displayed in the 
model's absence. 

The following hypotheses were advanced 
concerning children's reward criteria in the 
absence of the model as a function of the 
initial standards imposed on them and dis- 
played by the model. 

It was reasoned that the reward criteria 
adopted by subjects will be a function of both 
the criteria they observed a model use for 
herself and those she imposed on them di- 


rectly. When the observed and imposed cri- 
teria are consistent they should be adopted 
most readily. Therefore, greatest stringency 
will be shown by children who were held to 
a stringent standard and also observed a 
model who was stringent with herself. These 
children should be more likely to use higher 
standards for reward than either children who 
received the same stringent direct training 
but observed a lenient model or those who 
were permitted leniency themselves. More- 
over, when the observed and imposed criteria 
are discrepant, the less stringent alternative 
will be adopted. When the criterion leading 
to more reward is the one that subjects were 
directly trained to adopt they should have 
little conflict about rewarding themselves 
generously in the model’s absence and should 
maintain the lenient criterion on which they 
were trained. In contrast, those who were 
trained to be stringent but observed a more 
lenient model should be tempted to reward 
themselves more liberally when there are no 
external constraints. In the model’s absence 
their behavior should reflect conflict about 
adopting the lower criteria yielding more fre- 
quent reward used by their own model and 
the more stringent standards which had been 
imposed on them. Therefore it was anticipated 
that subjects would adopt lenient criteria 
more frequently when they had been 
permitted greater leniency themselves than 
when they observed it in another. 

The design also investigated the effects of 
the children’s role on their self-administered 
reward schedules and on the criteria they 
imposed upon others. When there is a dis- 
crepancy between the reward criteria imposed 
on the child and the standards he observed 
used by the model the criteria that the child 
adopts in the model’s absence and in the ab- 
sence of other external constraints may be 
affected by his role. First, the subject may 
more readily adopt his model’s criteria, as 
opposed to those on which he received direct 
training, when he himself becomes the model 
or demonstrator for another person than when 
he remains in the role of only a performer. 
Second, given that each subject is placed in 
both roles, and becomes both a demonstrator 
and a performer, the sequence in which these 
roles occur may affect the extent to which 
he adopts the criteria displayed by his model 
or those to which he was directly trained. 
Specifically, if following the initial interactions 
with the model the situation is structured 
so that the child immediately becomes 
the model or demonstrator for another person 
(say a younger child) he may be more likely 
to adopt the pattern displayed by his own 
model than if he is given this role after he 
has already practiced extensively as a performer. 
If such effects occur they would indicate 
that role factors may be important 
determinants of the acquisition and transmis- 
sion of self-reward patterns. In contrast, if 
the criteria adopted by the child, both for 
his self-reward and for rewarding others, are 
primarily a function of his prior experience 
with observed and imposed reward standards 
as discussed above, the effect of such role 
variables would be minimal. 

The following manipulations were used to 
test the effect of placing the child into the 
role of model or demonstrator as opposed to 
the role of performer and of varying the 
sequence of these roles. After the child's interaction 
with the adult model, half the children 
became “demonstrators” of the game for another 
younger child, alternating with him for 
a series of trials, and thereafter performed 
alone on additional trials. Half the subjects 
participated in the reverse sequence, first per- 
forming alone and then alternating trials with 
a younger child to whom they demonstrated 
the game. Both sequences took place in the 
absence of the experimenter as well as the 


* model. 


It is apparent from the above that the 
design permitted investigation of the effects 
of the independent variables not only on the 
acquisition of self-reward criteria but also 
on the transmission of these standards by 
subjects to others (the younger child) when 
the subject controls the available reinforcers. 
No differences in the criteria used by the 
subject for himself and those he imposes on 
the other child were anticipated. That is, 
the same between-treatment differences predicted 
for self-reward criteria were expected 
for the transmission of these standards to 
another person. 
METHOD 
Subjects and Experimenters 


Subjects were 54 fourth-grade children (30 boys 
and 24 girls) from two elementary schools in the 
Stanford area. One adult male was the experimenter 
for all subjects. Two adult females served as models, 
with one model used for each subject. Each model 
was employed with an equal number of children 
from each treatment condition. In the phases of 
the experiment dealing with the transmission of 
reward criteria, each experimental subject was con- 
fronted with a younger child. The younger child was 
always drawn from the second grade of the subject’s 
school and was of the same sex as the subject. A 
different second-grade child was used with each 
experimental subject. 


Summary of Design 


Each subject was randomly assigned to one of 
three model-subject interactions. One third of the 
children observed stringent reward criteria modeled 
but were led to use lenient criteria; one third 
observed lenient reward criteria modeled but were 
led to use stringent criteria; the remainder observed 
a model who used stringent criteria for herself and 
applied the same criteria to the child. Thereafter, in 
the absence of the model, half the subjects in each 
group demonstrated the game to another younger 
child and then performed alone, whereas the other 
half went through the reverse sequence, first per- 
forming alone and then demonstrating. 
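
The 3 x 2 layout summarized above can be sketched as a balanced random assignment. This is only an illustration under stated assumptions: the labels, the function name `assign_subjects`, and the seed are hypothetical, and the sketch does not reproduce the balancing across the two models or the sex matching described under Subjects and Experimenters.

```python
import random
from collections import Counter

# Descriptive labels for the three model-subject treatments (hypothetical wording).
TREATMENTS = [
    "stringent modeled, lenient imposed",
    "lenient modeled, stringent imposed",
    "stringent modeled, stringent imposed",
]
# Order of the two model-absent phases.
SEQUENCES = ["demonstrate first, then perform", "perform first, then demonstrate"]


def assign_subjects(subject_ids, seed=0):
    """Randomly assign subjects so that every treatment x sequence cell is equal in size."""
    cells = [(t, s) for t in TREATMENTS for s in SEQUENCES]
    if len(subject_ids) % len(cells) != 0:
        raise ValueError("number of subjects must be a multiple of the number of cells")
    per_cell = len(subject_ids) // len(cells)
    slots = [cell for cell in cells for _ in range(per_cell)]
    random.Random(seed).shuffle(slots)
    return dict(zip(subject_ids, slots))


# 54 subjects, as reported in the Method section.
assignment = assign_subjects(list(range(1, 55)))
cell_sizes = Counter(assignment.values())
```

With 54 subjects this yields 9 children per cell, that is, 18 per treatment, split evenly between the two sequences.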


Apparatus 


The apparatus was a modification of a bowling 
game used by Bandura and Whalen (1966) and 
consisted of a miniature bowling alley with a 3-foot 
runway at the end of which there were seven signal 
lights. Each light was labeled with a score, the score 
of 5 occurring once whereas scores of 10, 15, and 20 
each appeared twice. The lights and scores were 
displayed on an upright partition facing the bowler. 
Whereas the Bandura and Whalen apparatus was 
controlled manually by the experimenter, the present 
version contained a series of concealed electronic 
relay switches which were preset for each subject 
in order to control in a standardized manner the 
entire sequence of scores for all trials. This apparatus 
permitted all trials to occur in the absence of the 
experimenter and the latter recorded all data from 
behind a one-way observation window. The target 
area was screened from the subject’s view by shields 
which covered the terminal area of the runway and 
encircled the ostensible targets so that the child 
had no knowledge of whether or not the bowling 
balls were striking the target area and was dependent 
on the electric score signals for feedback. Pretesting 
indicated that the procedure appealed to the subjects 
and no doubts were raised about its credibility. 
Procedure 


Each subject was taken individually by the experi- 
menter from his classroom to a three-room research 
trailer located on the school premises. The experi- 
menter said he was from a toy company and that 
a new toy, something like a bowling game, was being 
tried out on children of this age group to see how 
they liked it. Upon reaching the trailer, the subject 
was introduced to the female experimental confed- 
erate who served as the model and was told that 
she would show him the game. The experimenter 
then left and observed and recorded the procedure 
from the trailer’s observation room. 

The model showed and explained the game to the 
child and demonstrated by rolling one trial, also 
indicating that both players would write down their 
scores on special score sheets at the end of each 
roll. She then called attention to a bowl of white 
chips, used as tokens, and explained that “the chips 
are worth valuable prizes at the end, and the more 
chips, the better the prize.” Wrapped packages of 
toys were visible in the trailer. No other statements 
describing the reward tokens were made and each 
player was given a container for collecting his 
tokens. Tokens, rather than candy or other rewards, 
were used to avoid satiation effects during the ex- 
periment. The model and child alternated turns for 
a total of 10 trials each, the model taking the first 
turn. The scores for both model and child were 
always in a fixed 10-trial sequence and the same 
sequence was used in each series of subsequent test 
trials described below. The model’s program was 5, 
20, 10, 15, 15, 20, 20, 10, 20, 15, whereas the sub- 
ject's program was always 20, 10, 15, 20, 20, 5, 15, 
5, 10, 5. 


Discrepancies between Modeled and Imposed 
Reward Criteria 


The model-subject discrepancy treatments involved 
the following variations in the discrepancy of the 
scores for which the model rewarded herself and 
those for which she guided the subject to reward 
himself. 

In the stringent criterion modeled, lenient criterion 
imposed treatment (M20, S15/20) the model rewarded 
herself only for scores of 20 but led the subject to 
reward himself for scores of 15 or 20. Whenever the 
model’s score was 20 she took a token and made 
approving comments such as, “That’s a good score. 
That deserves a chip.” or “I can be proud of that 
score. I should treat myself for that.” In contrast, 
whenever her score was below the criterion of 20 
she refrained from taking a token and commented 
with obvious self-disapproval, "That's not a very 
good score. That doesn’t deserve a chip.” or “Well, 
I can’t be very proud of that. I can’t treat myself 
for that low score.” Using a fixed memorized script 
she addressed similar approving comments to the 
child whenever his score was either 15 or 20 and 
made parallel critical comments whenever his score 
was below 15. 


In the lenient criterion modeled, stringent criterion 
imposed condition (M15/20, S20) the model used the 
same pattern of self-reward and self-disapproval and 
applied the same positive and negative comments 
. . . positively evaluating the child's performance. 
When the model's score was either 15 or 20 she expressed 
approval and helped herself to a token and when 
her scores were below 15 she expressed disapproval 
at her own performance and refrained from rewarding 
herself. However, she showed approval of the 
child's performance and commented that a chip was 
deserved only on trials when the child's score was 
20, making her negative comments for all lower 
scores. 

In the stringent criterion modeled, stringent criterion 
imposed condition (M20, S20) the model displayed 
her self-reward pattern only for scores of 20 
and likewise commented positively on the child's 
performance only when he obtained scores of 20, 
indicating for all other performances that they did 
not deserve a chip and showing dissatisfaction. 
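
The consequences of the three treatments for the fixed score programs can be worked through with a short sketch. The two programs are those given in the Procedure; the constant and function names are hypothetical, and a criterion is represented simply as the lowest score that earns a token.

```python
# Fixed 10-trial score programs reported in the Procedure.
MODEL_PROGRAM = [5, 20, 10, 15, 15, 20, 20, 10, 20, 15]
SUBJECT_PROGRAM = [20, 10, 15, 20, 20, 5, 15, 5, 10, 5]

STRINGENT = 20  # token only for scores of 20
LENIENT = 15    # token for scores of 15 or 20

# (criterion the model applied to herself, criterion she imposed on the child)
TREATMENTS = {
    "stringent modeled, lenient imposed": (STRINGENT, LENIENT),
    "lenient modeled, stringent imposed": (LENIENT, STRINGENT),
    "stringent modeled, stringent imposed": (STRINGENT, STRINGENT),
}


def token_count(program, minimum_score):
    """Number of trials in the program whose score meets the reward criterion."""
    return sum(1 for score in program if score >= minimum_score)


# Tokens earned by (model, subject) over the 10 training trials in each treatment.
tokens = {
    name: (token_count(MODEL_PROGRAM, model_min), token_count(SUBJECT_PROGRAM, subject_min))
    for name, (model_min, subject_min) in TREATMENTS.items()
}
```

On these programs a child held to the stringent criterion earns 3 tokens in 10 trials and a child permitted the lenient criterion earns 5, while the model earns 4 under the stringent and 7 under the lenient criterion.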

Previous research (Bandura & Kupers, 1964) has 
already demonstrated that children exposed to a 
model's self-reward criteria in a similar experimental 
situation adopt the model's criteria whereas children 
in a no-model control group adopt essentially randomly 
distributed self-reward patterns that are unrelated 
to performance level. A no-model group 
therefore was not employed in the present study. 

After both model and subject completed 10 trials 
the model said she had to leave and did so, collecting 
her own chips with enthusiasm and noting they 
would be exchanged now for valuable prizes. The 
experimenter returned and spent 5 minutes with the 
child on a simple unrelated guessing game which was 
used to reduce any immediate emotional arousal 
resulting from the treatments before the next phase 
of the experiment commenced. 


Role Treatments 


After the model-subject interactions described 
above, the following variations were used to investi- 
gate role effects. Namely, half the children in each 
model-subject discrepancy treatment were left alone 
for 10 trials to be performers (P) by themselves. 
The experimenter instructed them to bowl and help 
themselves to rewards as they pleased and then 
exited. Following these 10 trials the experimenter 
reentered with a younger child (O). He introduced 
the two children, asking the subject to demonstrate 
(D) the game to O, and left the two alone with 
each other until they had alternated turns for 10 
trials each. This was the performer-demonstrator or 
P-D sequence. With the other half of the subjects 
in each treatment group the above sequence was 
reversed. These subjects immediately demonstrated 
the game to the younger child, the two children 
taking turns for 10 trials each, and thereafter the 
subject was left alone to perform by himself for another 
10 trials (demonstrator-performer or D-P 
sequence). Thus, by comparing behavior when the 
subject is a performer and when he is a demon- 
strator basic role differences can be tested, and by 
comparing the P-D and D-P orders role sequence 
effects may be examined. 


Measures of Reward Patterns 


The dependent measures collected from subjects in 
both sequences were the scores for which self-reward 
occurred when performing alone (self-reward 
when performer), when demonstrating the game to 
the other child (self-reward when demonstrator), 
as well as the scores for which the subject rewarded 
the other child (rewards other). All dependent measures 
were collected in the absence of both the model 
and experimenter and were recorded by the latter 
through an observation window. 


RESULTS 


Inspection of the data in each treatment 
for males and females separately indicated no 
trends for sex differences and male and female 
data were therefore combined for all analyses. 
More than 92% of all subjects used either 
scores of 20-only or scores of 15 or 20 as the 
reward criteria. The data therefore required a 
nonparametric test and chi-square comparisons 
of reward for scores of 20-only as opposed 
to scores below 20 were used in all 
contingency tables with N greater than 20.² 
Fisher exact tests were used when N was 
below 20. 
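
The two tests named above can be sketched for a 2 x 2 table with cells a, b (first row) and c, d (second row). This is a minimal illustration, not the authors' computation: the function names are hypothetical, the chi-square uses the Yates continuity correction, and the Fisher test is the two-sided version that sums all tables no more probable than the one observed. The source does not say which test applies when N is exactly 20; the helper below assigns that case to the Fisher test as an assumption.

```python
from math import comb


def yates_chi_square(a, b, c, d):
    """Chi-square for a 2 x 2 table with the Yates correction for continuity."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column must have a nonzero total")
    # The corrected deviation is not allowed to go below zero.
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    return n * corrected ** 2 / denominator


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact probability for a 2 x 2 table."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def probability(x):
        # Hypergeometric probability of x in the upper-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = probability(a)
    return sum(
        probability(x)
        for x in range(lowest, highest + 1)
        if probability(x) <= observed * (1 + 1e-9)
    )


def choose_test(n):
    """Chi-square when N exceeds 20, otherwise Fisher (N = 20 is an assumption)."""
    return "chi-square" if n > 20 else "fisher"
```

For example, `yates_chi_square(10, 2, 3, 9)` evaluates a hypothetical table of 24 children, and `fisher_exact_two_sided(3, 1, 1, 3)` a hypothetical table of 8; neither table is taken from the study.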

The experimental procedures used to guide
the subject to reward himself only for particular
contingencies in the model's presence,
by means of the model's verbal approval and
disapproval, were completely effective. In all
treatment conditions subjects without exception
rewarded themselves in the model's presence
whenever she indicated that the score
was deserving and never when she commented
negatively on the performance. The effectiveness
of this guidance procedure made it possible
to use all subjects for investigating
the effects of the experimental variables on
behavior in the model's absence.

Figure 1 shows the percentage of subjects
who administered rewards for scores below 20
in each treatment group in all model-absent
phases of the experiment. Note that whether
the child was a performer first or a demonstrator
first did not appear to affect appreciably
his reward criteria within each model-subject
discrepancy treatment. Fisher exact
test comparisons of subjects who rewarded
for scores below 20 as opposed to scores of
20-only in the P-D as opposed to the D-P
sequence within each discrepancy treatment
yielded no p values approaching significance.
Therefore, subsequent comparisons between
discrepancy treatments combined subjects
from both the D-P and P-D sequences.


² All chi-squares were corrected for continuity.


Fig. 1. Self-reward when the subject is a performer
or a demonstrator and reward of other child,
as a function of the performer-demonstrator (P-D)
or demonstrator-performer (D-P) sequence and the
initial criteria exhibited by the model and imposed
on the subject.


TABLE 1


TREATMENT COMPARISONS OF SUBJECTS REWARDING
FOR SCORES OF 20-ONLY OR SCORES BELOW 20
IN EACH PHASE


                                      Phases
Comparisons         | Self-reward | Self-reward    | Rewards
                    | (performer) | (demonstrator) | other child
                    |     χ²      |      χ²        |     χ²

M20, S20            |             |                |
  versus            |    7.8      |    11.2        |    9.48
M15/20, S20ᵃ        |             |                |

M15/20, S20         |             |                |
  versus            |    8.1      |     3.46       |    9.48
M20, S15/20ᵃ        |             |                |

M20, S15/20ᵃ        |             |                |
  versus            |   28.55     |    25.3        |   32.19
M20, S20            |             |                |


Note. All chi-squares corrected for continuity, df = 1.
ᵃ Indicates the treatment in each comparison in which more
subjects rewarded for scores below 20; M = modeled criterion;
S = criterion imposed on subject.


50 WALTER MISCHEL AND ROBERT M. LIEBERT


Comparisons of the discrepancy treatments
for each phase of the experiment separately
are summarized in Table 1. The findings
clearly support the hypotheses. The greatest
stringency was shown by subjects who observed
a stringent model and were themselves
held to the same stringent criterion (M20,
S20). These children made reward contingent
on a higher performance level more often
than those who obtained identical direct
training but from a model who used a more
lenient criterion for her own self-reward
(M15/20, S20). Likewise, comparisons between
discrepancy groups showed, as expected, that
children who were trained on a stringent criterion
but observed a more lenient model
(M15/20, S20) were more frequently stringent
than those who were permitted more lenient
self-reward but observed a model who was
stringent in her own self-reward (M20, S15/20).

Moreover, these differences hold at each 
phase of the experiment: the treatments af- 
fected the subject’s self-reward both as a per- 
former and as a demonstrator as well as the 
reward criteria he imposed on the younger 
child when demonstrating the game to him. 
It is of interest that the effects resulted in p 
values less than .01 in all but one instance. 
The one exception was the chi-square com- 
parison of self-reward in the two discrepancy 
treatments when the subject served as a 
demonstrator and this yielded the least im- 
pressive p value (< .10, two-tailed). Closer 
examination of these data with Fisher exact
tests revealed that for this phase the discrepancy
treatments resulted in the expected
differences when the subject served as a performer
first (p < .025) but not when he
served as a demonstrator first (p < .30).
Thus, although the D-P as opposed to the
P-D role sequences resulted in no significant
differences within each discrepancy treatment,
apparently they did, in this instance, act as
mediating variables affecting differences between
the discrepancy treatments. The lack
of difference between the two discrepancy
treatments on the self-reward schedule used
when the subject was initially in the role of a
demonstrator seems due to the more frequent
adoption by these children of the model's own
reward criteria rather than the criteria that
the model imposed on them. These trends are
reflected graphically in Figure 1, which shows
a tendency for subjects in the D-P sequence
in each discrepancy condition to adopt the
model's standards rather than those imposed
on them.

The children tended to use the same criteria
for administering rewards in each phase
of the experiment (Figure 1). When the
modeled and imposed reward standards were
the same (M20, S20) all subjects rewarded
themselves for the same criteria when performing
alone and when demonstrating, and
applied the same standards to the younger
child. Likewise, when the modeled and imposed
reward criteria were discrepant, subjects
in the P-D sequence showed perfect
consistency in the reward criteria they employed
across all phases of the experiment.
In contrast, however, among the 18 children
in the two discrepancy treatments who were
in the D-P sequence, 5 used different criteria
for themselves when serving as demonstrators
and when performing alone. When they
served as demonstrators, 4 of these subjects
rewarded themselves for the same criteria
that their model had used for herself rather
than in accord with the standards she had
imposed on them. This occurred with equal
frequency in both discrepancy treatments. In
the treatment in which the model had been
stringent but had permitted leniency, 2 children
rewarded themselves more stringently
when demonstrating than when performing.
Likewise, in the treatment in which the model
had been lenient but had guided the child
to be stringent, 2 children rewarded themselves
more leniently when they served as
demonstrators than when they performed
alone. These trends, while not significant,
suggest that emulation of the model occurs
more readily when the child is immediately
placed in the role of a model himself than
when he is placed initially in the role of
performer. Comparisons of the number of
subjects in the D-P as opposed to the P-D
sequence who used the same criteria for themselves
when they served as demonstrators and
when they performed alone revealed that children
who began as demonstrators tended to
use different criteria more often than those
who began as performers (χ² = 3.71, df = 1,
p < .06).


DISCUSSION 


The results of this experiment show that
patterns of self-reinforcement may be affected
jointly by the criteria displayed by social
models and the standards directly imposed
on the observer, with the resultant behavior
determined by a predictable interaction of
both factors. The hypothesized effects of discrepancies
and consistency between observed
and imposed reward criteria received strong
support. As predicted, when modeled and imposed
reinforcement criteria were of different
stringency, subjects adopted the more lenient
alternative for themselves, and this occurred
more frequently when they were permitted
self-reward for a relatively low standard but
observed a model rewarding herself for higher
achievements than when they were held to
stringent criteria while observing a model displaying
a more lenient self-reward pattern.
Subjects in both these conditions adopted
more lenient self-reward patterns than those
who were held to a stringent self-reward
criterion and observed the model using an
equally stringent reward schedule for herself.
Most interesting is the fact that children who
were trained on a stringent criterion by a
model who was similarly stringent with herself
adopted and transmitted more stringent
reward criteria than those who received the
identical direct training but from a model
who exhibited greater leniency in her own
self-reward.

When the modeled and imposed standards
were consistent they were adopted in the
model's absence without a single deviation
in spite of the relative stringency of the
criterion and the desirability of the rewards
which were freely available for self-administration
without external constraints. Extremely
precise, but not perfect, adoption of
reinforcement patterns was also obtained in
the Bandura and Kupers (1964) study in
which children observed a model's self-reward
patterns without themselves being administered
any direct differential reinforcement. As
Bandura and Kupers noted, it is likely that
precise matching is enhanced when relevant
normative data for performance quality are
lacking or ambiguous. The fact that in the
present study children who witnessed a model
rewarding herself for a high standard and
who were led to use the same high criterion
for themselves adopted it in her absence without
any deviations reflects the potency of
combining modeling procedures with direct
behavior guidance and reinforcement. It
should also be noted that in the present
study, in contrast to previous investigations
on the modeling of self-reinforcement patterns,
the apparatus permitted the subject to
perform and to regulate his own reward
schedules in the absence of both the model
and the experimenter on all test trials. The
maintenance of the predicted reward patterns
in the absence of all external restraints further
indicates that the variables studied in
this experiment are determinants of behaviors
frequently used as indices of self-control
or "internalization."

Placing the subject in the role of a demonstrator
for another person, as opposed to
leaving him alone in the role of performer,
had only minimal effects, and the reward
patterns of children in these two role conditions
were not significantly different within
each main model-subject discrepancy treatment.
Notice (Figure 1) that there were no
role effects whatsoever when the modeled and
imposed reward schedules were identical. It
is of interest, however, that role variables apparently
did have indirect effects on the subject's
self-reward schedule in the two model-subject
discrepancy conditions when the subject
was in the role of demonstrator with
another younger child (self-reward when
demonstrator in Figure 1). In this role, subjects
who began as demonstrators (D-P
sequence) tended to use the reward criteria
displayed by their own model rather than the
criteria that were imposed on them to a somewhat,
although not significantly, greater degree
than did subjects who were first placed
in the role of a performer. Thus, in the D-P
sequence, but not in the P-D sequence, subjects
who had observed a stringent model
became slightly more stringent and those who
had observed a more lenient model became
somewhat more lenient in their self-reward
when they themselves served in the role of
models. Apparently immediate practice of the
demonstrator's role, prior to extensive practice
in the performer's role, facilitates adoption
of the modeled as opposed to directly
trained behaviors.

Subjects tended to transmit to others the 
same reward patterns which they adopted for 
themselves as a function of the criteria which 
they had observed modeled and which had been 
imposed on them. On the whole, the children
were extremely consistent in the criteria they 
used for their own self-reinforcement, both 
in their roles as performers and demonstra- 
tors, and in those they transmitted to another 
person when they served as his model. How- 
ever, when subjects were immediately placed 
in the role of performers they showed perfect 
consistency in the reward criteria they used 
for themselves and those they imposed on 
another child, whereas when they served as 
demonstrators first there was some inconsist- 
ency. The differences in extent of incon- 
sistency as a function of role treatments are 
suggestive of what might be considered “role 
discrimination.” That is, the inconsistency 
shown by some subjects who served first as 
demonstrators reflected their use of the 
model’s standards rather than those imposed 
on them by the model but only when they 
became models themselves. 

The design of the present study appears to
have utility for investigating the transmission
as well as the acquisition of patterns of
self-reward, and role variations and other
variables affecting the "cultural transmission"
of self-control patterns seem open to
meaningful investigation through this paradigm.

The predicted effects of the model-subject
discrepancy treatments were so strong both
when the modeled and imposed behaviors
were identical, and when the child was permitted
lenient self-reward but observed the
model rewarding herself stringently, that there
was virtually no variance in the children's
subsequent behavior. However, when the child
was held to a stringent criterion but observed
a model who was lenient with herself, approximately
half the children adopted and
transmitted the stringent standards imposed
on them and half used the more liberal criteria
which they had observed. The prediction
and manipulation of behavior in this
kind of discrepancy condition appears intriguing
for future research, and an investigation
is planned to isolate some of the determinants
affecting whether the individual adheres
to the more self-denying schedule which was
imposed on him or adopts the more generous
patterns which he observed exhibited by a
social agent. In addition to individual differences
in previous social learning histories,
such factors as the attributes of the model,
including his similarity to the subject, and
the subject's role seem to be relevant variables.
In the present experiment, data on the
children's willingness to delay immediate but
less valued gratification for the sake of more
valued but delayed rewards, elicited in simple
real choice situations, were collected in the
manner described previously (e.g., Mischel &
Gilligan, 1964). No relationships approaching
significance were found between this aspect
of self-control and the children's behavior
following the discrepancy treatments. Although
certainly not definitive, this finding is
in accord with numerous other recent investigations
on self-control in pointing to the
specificity of different aspects of self-control
behavior and suggesting that such behaviors
as delay of gratification and the regulation of
self-reward schedules for particular performance
contingencies may be relatively independent
and governed by different antecedents
without any underlying unitary moral agency
(Aronfreed, 1964; Bandura & Walters, 1963).
The overall findings of this experiment appear
to have clear implications for socialization
practices and therapy. The study demonstrated
that consistency in the standards
which an individual is trained to use for himself
and those he observes used by social
agents facilitates the adoption and transmission
of these standards and pointed to some
of the variables that can determine the performance
levels which the person adopts for
his own self-reward and for reinforcing others.
Many cultural and individual differences reflect
differences in the kinds and liberalness
of criteria used for the administration of
rewards and punishments. There is abundant
clinical evidence that for troubled individuals
the inappropriate regulation of self-administered
rewards and punishments often is a
central problem. A host of deviant behavior
patterns, such as psychopathy, masochism,
depression, sadism, etc., may be construed as
reflecting the inappropriate regulation of
self-administered rewards and punishments and
the imposition of excessively harsh or generous
standards on other people. The isolation of
antecedents of this aspect of self-control
therefore seems to have particular importance.


VICARIOUS CLASSICAL CONDITIONING AS A FUNCTION OF AROUSAL LEVEL ¹


ALBERT BANDURA AND THEODORE L. ROSENTHAL
Stanford University 


This study investigated the effects of emotional arousal, manipulated both
psychologically and physiologically, on vicarious conditioning processes. 5 groups
of observers underwent procedures designed to induce differential degrees of
arousal. The observers then participated in a vicarious aversive conditioning
paradigm in which a model exhibited pain cues in conjunction with an
auditory stimulus, and the acquisition and extinction of observers' emotional
responses to the conditioned stimulus were studied. The results disclosed that
vicarious conditioning is positively related to degree of psychological stress; a
monotonic decreasing function is obtained when, in addition to situational
stress, Ss experience increasing physiologically induced arousal. There is also
some suggestive evidence that the disruptive effects of high levels of arousal
may be mediated by self-generated competing responses designed to neutralize
the aversiveness of the vicarious instigation situation.


Increasing attention has been drawn in re- 
cent years to the influential role of vicarious 
experiences in the social-learning process. 
Most relevant research, however, has been 
essentially confined to the transmission of in- 
strumental classes of responses as a function 
of exposure to real-life or symbolic models
(Bandura, 1962, 1965; Bandura & Walters, 
1963). Vicarious classical or respondent con- 
ditioning, on the other hand, has received sur- 
prisingly little experimental attention despite 
ample evidence from informal observation 
that emotional responses are frequently ac- 
quired through observation of the pain and 
fear reactions exhibited by other persons ex- 
posed to aversive stimuli; conversely, positive 
incentive learning may also occur on a vicari- 
ous basis by observing others experiencing 
positive reinforcement in contiguous association
with discriminative stimuli.

In laboratory investigations of vicarious
classical conditioning (Barnett & Benedetti,
1960; Berger, 1962), one person, the performer
or model, typically undergoes an aversive
conditioning procedure in which a formerly
neutral stimulus is presented, and shortly
thereafter the model displays pain cues and
other emotional reactions supposedly in response
to an unconditioned aversive stimulus.
If an observer witnesses the model undergoing
this conditioning procedure, the observer
will also begin to exhibit emotional responses
to the conditioned stimulus alone,
even though he has not himself experienced
the aversive stimulation directly.

¹ This investigation was supported by Research
Grant M-5162 from the National Institutes of Health,
United States Public Health Service.

The authors are indebted to David Polefka who
assisted with the collection of the data, and to Daniel
J. Feldman, Director of Rehabilitation Medicine,
for his aid in arranging the research facilities at the
Stanford Medical Center.

Although the process of vicarious conditioning
has been clearly demonstrated (Berger,
1962), wide interindividual variability
has been noted in the acquisition rate and
stability of vicariously acquired conditioned
responses. Since this process, which is most
likely mediated through stimulus generalization,
requires the observer to experience vicariously
another person's pain responses,
thereby producing emotional arousal in the
observer, it seems plausible to hypothesize
that variables which influence an observer's
general level of emotional arousal will partly
determine the rate and stability of vicarious
learning.

There are numerous investigations of aversive
classical conditioning as a function of
subjects' arousal level in which arousal is
either manipulated directly by varying the
intensity of the unconditioned stimulus or
assessed in terms of personality measures of
emotional responsivity. Typically these studies
have shown that conditioned responses are
developed more rapidly and, once acquired,
extinguish less readily under conditions of
high, as compared to low, arousal (Doerfler
& Kramer, 1959; Spence, 1958, 1964). From
the latter findings it might be expected that
vicarious conditioning would likewise be
positively related to degree of psychologically
induced arousal.

A considerable body of recent experimentation
exploring the interaction of social-stimulus
and physiological determinants of emotional
states (Schachter, 1964; Schachter &
Singer, 1962; Schachter & Wheeler, 1962)
indicates that administration of epinephrine,
a sympathetic stimulant, may enhance persons'
susceptibility to modeling influences. In
particular, given epinephrine arousal without
accurate information of its side-effects, subjects
displayed much greater matching of
models' aggressive, euphoric, and jocular behavior
than subjects who were exposed to
these models without prior physiological
arousal or were given a sympathetic depressant.

Findings from studies concerning the effects
of autonomic arousal on fearful and avoidant
behavior, although not employing modeling
procedures, nevertheless have implications
for the vicarious instigation and acquisition
of affective responses. Singer (1963), for example,
found at the infrahuman level that
rats injected with epinephrine displayed considerably
more fear in response to aversive
stimuli than did placebo- or chlorpromazine-injected
animals. However, available evidence
(Latané & Schachter, 1962) indicates that
acquisition of emotional responses through
direct aversive conditioning is significantly
influenced by the dose level of adrenalin employed:
Small doses of adrenalin generally
facilitate avoidance conditioning, whereas
large doses have negligible effects on avoidance
behavior, suggesting a nonmonotonic relationship
between autonomic arousal and
conditioned emotional responses.

The present experiment was designed to ex- 
plore the effects of varying degrees of 
arousal, manipulated both psychologically 
and physiologically, on vicarious conditioning 
processes. Subjects participated in a vicarious 
aversive conditioning paradigm in which a 
model emitted pain cues in conjunction with 
an auditory conditioned stimulus, and the 


observers’ acquisition and extinction of auto- 
nomic responses to the conditioned stimulus 
were studied. The following treatment condi- 
tions were included in the experiment: 

1. No injection-nonthreat condition. These 
observers were subjected to no direct experi- 
ences of an emotion-provoking sort, and con- 
sequently provide an index of vicarious con- 
ditioning under relatively low arousal. 

2. Placebo injection. Subjects in this condition
received a placebo hypodermic without
any knowledge of its contents which, for most
subjects, constituted a moderately anxiety-arousing
experience.

3. Placebo injection plus threat of aversive
stimulation. This group of observers, which
also received the placebo injection, was informed
that following the conditioning of the
model, they too would undergo the painful
shock stimulation. The threat of impending
shock was designed to induce an additional
increment of emotional arousal.

4. Epinephrine-induced arousal: Small dose.
Observers assigned to this group received a
dose of epinephrine sufficient to produce a
noticeable physiological effect.

5. Epinephrine-induced arousal: Large dose. 
The dosage level employed in this condition 
was capable of producing sizable sympathetic 
arousal. 

The two sets of operations thus provide
three degrees of psychologically induced emotional
arousal (i.e., nonthreat, placebo injection,
placebo injection plus shock threat) and
three points on a physiological arousal continuum
(epinephrine dosage of .2 and .5,
with the placebo injection condition serving
as a 0 dosage group).
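
The way the five conditions map onto the two arousal continua can be laid out as a small data structure. The following is a minimal sketch for illustration only: the class and function names are hypothetical, and the dosage values are taken as injected volumes in cubic centimeters, which is how the Injections section of the paper reports them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Condition:
    name: str
    injection: Optional[str]   # None, "placebo", or "epinephrine"
    epinephrine_cc: float      # injected volume of epinephrine (assumed unit: cc)
    shock_threat: bool


CONDITIONS = [
    Condition("no injection-nonthreat", None, 0.0, False),
    Condition("placebo injection", "placebo", 0.0, False),
    Condition("placebo injection plus shock threat", "placebo", 0.0, True),
    Condition("epinephrine, small dose", "epinephrine", 0.2, False),
    Condition("epinephrine, large dose", "epinephrine", 0.5, False),
]


def psychological_continuum(conditions: List[Condition]) -> List[Condition]:
    """Drug-free conditions ordered nonthreat < placebo < placebo plus threat."""
    rows = [c for c in conditions if c.injection != "epinephrine"]
    return sorted(rows, key=lambda c: (c.injection is not None, c.shock_threat))


def physiological_continuum(conditions: List[Condition]) -> List[Condition]:
    """Injected, nonthreatened conditions ordered by epinephrine volume."""
    rows = [c for c in conditions if c.injection is not None and not c.shock_threat]
    return sorted(rows, key=lambda c: c.epinephrine_cc)
```

Note that the placebo injection group appears on both continua: it is the middle point of the psychological series and the zero-dosage point of the physiological series.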

Individuals have been shown to differ 
markedly in their predispositions to emo- 
tional responsivity under conditions of aver- 
sive stimulation. To the extent that vicarious 
learning is partly governed by arousal level, 
the rate of conditioning is likely to reflect the 
combined effect of momentary states of 
arousal and emotional proneness. Therefore, 
in order to test for expected interactive ef- 
fects of response-defined proneness to emo- 
tionality and experimentally induced arousal 
on vicariously acquired responses, the subjects 
in the experiment were administered a measure
of emotional proneness.

Since emotional response predispositions
are unlikely to be activated under nonthreat- 
ening conditions, no relationships between 
the personality and the conditioning measures 
were expected among subjects in the low- 
arousal group. On the assumption that the 
experimental manipulations would be suffi- 
ciently emotion-provoking to elicit existing 
differential dispositions but insufficient to 
evoke debilitating levels of arousal, it was 
anticipated that emotionality and vicarious 
conditioning would be positively related in 
the stressful treatment conditions. 


METHOD
Subjects 


A total of 100 paid volunteers, 20 in each treatment
condition, participated in the experiment.* Because
use of injection procedures required subjects'
written consent, the study was confined to college
students who were 21 years of age or older.


Procedure 


The subjects first reported individually to the 
Psychology Department, where they were led to 
believe that the study for which they had volunteered
was concerned with the effects of certain
common, but unspecified, drugs upon psychomotor 
performance. Based on preliminary information, 
several cases were excluded because of physical con- 
traindications, and four volunteers withdrew from 
the experiment because they had marked fears of 
hypodermic injections. During this initial session the 
subjects signed, in the presence of witnesses, a 
legalistic statement releasing the experimenters from 
any liability. In order to reinforce further the set 
that the subjects would, in fact, receive injections of 
pharmacologically active agents, all subjects except 
those in the no injection-nonthreat group were in- 
formed that some effects of the drugs might persist 
beyond the test period, and for this reason they 
were cautioned against strenuous activities or en- 
gaging in any endeavors requiring special skills or 
fine coordination for at least 12 hours after the 
experimental session. In addition, subjects com- 
pleted the Taylor Manifest Anxiety (MA) scale to 
provide a measure of emotional proneness, and were 
then instructed to report several days later to the 
Stanford Medical Center where they would be 
tested in pairs for reasons that would be explained 
then. 

With subjects assigned to the no injection-non- 
threat group, there was no mention of drugs, nor 
were they asked to sign the ominous medical-liabil- 
ity form. To further reduce any possible situational 
stress, these subjects were told that the experiment 
was being conducted in the medical setting simply 
because the laboratory equipment was located 
there. 


In the second session of the study, in order to
enhance the credibility of the situation, the experimenter's
confederate, a male college student who
served as the model for all subjects, timed his appearance
at the laboratory either to coincide with,
or to follow, the arrival of the subject. After preliminary
introductions, the experimenter announced
that the model had been assigned on a random basis
to the "pain-emotion" condition, whereas the subject
would simply serve as his matched control to
provide a base line during the session for physiological
response to the particular drug, independent
of any painful stimulation. The experimenter then
asked the model whether he had any objections to
receiving relatively painful shocks of moderate intensity
since effects of both pharmacological and
pain-producing factors on perceptual-motor coordination
were being studied. After exhibiting mild
hesitancy, the model expressed his willingness to
undergo the procedures. In order to counteract any
possible self-induced emotional arousal, should the
observer expect that he might also be the recipient
of aversive stimulation, the model asked whether
electric shocks would be administered both to him
and to his partner. The experimenter repeated that
the observer was a control subject and, therefore,
would not be shocked at any time. Subjects in the
shock-threat condition, however, were informed
that they would be subjected to the same painful
stimulation following the completion of the model's
performance.

The experimenter then ushered the model off, supposedly
to receive his drug injection. After the lapse
of an interval equivalent to that required to ad- 
minister the hypodermic, the model returned to the 
experimental room buttoning his left shirt-sleeve. 
The subject was then sent to the physician for a 
brief examination and the injection. The necessity for 
following a double-blind procedure in drug research 
was the reason offered to justify keeping the par- 
ticipants uninformed about the nature of the phar- 
macological agents being administered. 

For subjects in the no-injection condition, the 
vicarious conditioning phase of the experiment commenced
immediately upon the model's return.


Injections. Subjects in the placebo and the shock-threat
conditions received a subcutaneous injection
of .5 cubic centimeter of saline solution.

The epinephrine dosages were selected primarily
on the basis of findings from other experiments
(Clemens, 1957; Schachter & Singer, 1962) that have
employed different levels within the effective dose
range. Subjects in the epinephrine-large dose condition
received a subcutaneous injection of .5 cubic
centimeter in 1:1000 saline solution, which is sufficient
to produce sizable physiological arousal. In
the epinephrine-small dose condition, the subjects
received a subcutaneous injection of .2 cubic centimeter
in 1:1000 saline. The conditioning phase of
the experiment was begun approximately 5 minutes
after injection.
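
As a worked illustration of the two dose levels, the injected volumes can be converted to drug mass. This sketch assumes that "1:1000" is the conventional weight/volume ratio strength (1 gram of drug per 1000 milliliters of solution, i.e., 1 milligram per milliliter) and that 1 cubic centimeter equals 1 milliliter; the paper itself reports only the volumes, and the function name is hypothetical.

```python
# Assumption: 1:1000 (w/v) means 1 g of epinephrine per 1000 mL of solution.
MG_PER_ML_1_TO_1000 = 1000.0 / 1000.0  # 1000 mg per 1000 mL = 1 mg per mL


def epinephrine_mg(volume_cc, mg_per_ml=MG_PER_ML_1_TO_1000):
    """Mass of epinephrine (mg) in an injected volume, taking 1 cc = 1 mL."""
    return volume_cc * mg_per_ml


large_dose_mg = epinephrine_mg(0.5)  # large dose condition
small_dose_mg = epinephrine_mg(0.2)  # small dose condition
```

Under these assumptions the large and small doses correspond to 0.5 mg and 0.2 mg of epinephrine, respectively.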

Vicarious conditioning. While the subject was re- 
ceiving his injection, GSR electrodes were affixed to 
the fingers of the model's left hand to maintain 
verisimilitude, and dummy shock electrodes were 
attached to the wrist of the model's right hand. 
After the subject entered the experimental room 
he was seated comfortably in a chair positioned to 
provide a clear right-side view of the model. A GSR
bipolar pickup device (§ X 1 inch electrodes bent
to the contours of the subject's index and third
fingers, coated with electrode paste) was then firmly
taped to the fingers of the left hand.

Resting on the table immediately in front of the 
model was a pursuit-rotor apparatus which served as 
the cover task for presenting the conditioned stimu- 
lus-unconditioned stimulis (CS-UCS) pairings to 
the model. This particular orienting task was se- 
lected because it could be effectively utilized to 
channel the subject's observing responses and to 
enhance the pain-response cues emitted by the 
model. The experimenter explained to the model, as 
though he were a naive subject, that the apparatus 
provided a sensitive measure of motor coordination. 
He was instructed to keep the end of the stylus on 
the small target circle located on the turntable as 
best he could, while the turntable revolved. It was 
further explained that at periodic intervals a buzzer 
would sound, and shortly thereafter a moderately 
painful shock would be administered to the model's 
right hand, although he might not receive a shock 
every time the buzzer sounded. The observer, in 
turn, was asked to sit quietly and to observe closely 
the model's performance, ostensibly to duplicate the 
model's task-relevant stimulation but, in fact, to 
ensure the occurrence of the necessary observing 
responses. 

After the injection and the presentation of in- 
structions, the adaptation phase of the experiment 
was begun. The purpose of this phase was to neu- 
tralize any aversive properties of the buzzer, which 
served as the CS, and to allow observers to adapt 
to the apparatus and procedures. The adaptation 
series consisted of repeated presentations of the CS 
alone until the observer failed to exhibit any re- 
sponses on three consecutive trials. 

The adaptation phase was followed immediately 
by the vicarious acquisition series consisting of 10 
conjoint presentations of the CS and the model's 
pain responses. In each of these trials the buzzer 
was sounded and approximately .5 second after the 
onset of the CS the model suddenly flexed his right 
arm, dropped the stylus and winced, creating the 
impression that a painful shock had been delivered. 
These pain reactions were convincingly feigned and 
no shocks were in fact administered to the model. 
Six CS-alone trials were interspersed among the 10 
vicarious acquisition trials as tests of the degree to 
which the CS was accruing conditioned aversive 
properties. During the test trials the buzzer was 
sounded, but the model exhibited no response what- 
soever. The order of the test trials was 2, 6, 9, 10, 
12, and 15. At the completion of the acquisition- 
test series all subjects were given 10 extinction trials 
in which the CS was presented alone. 

TABLE 1 

DIFFERENCES IN PHYSIOLOGICAL REACTIONS REPORTED 
BY GROUPS OF SUBJECTS RECEIVING INJECTIONS 
AND IN MEAN SKIN CONDUCTANCE LEVELS 

                 Number of    Number of    Conductance levels (in micromhos) 
Treatment        subjects     reactions 
conditions       reporting    reported     Preadaptation    Postadaptation 
                 reactions 

Epinephrine .5      20           46            14.1             15.0 
Epinephrine .2      14           26            15.4             16.4 
Shock-threat         7            8            15.9             18.4 
Placebo              3            4            14.9             18.9 
Nonthreat                                      18.1             20.9 

Intertrial intervals at each phase of the experi- 
ment were varied systematically in an irregular 
fashion within 15-40 seconds. The total time elapsing 
between the beginning of the adaptation and com- 
pletion of the extinction series was approximately 
15 minutes. 
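
The acquisition-test series described above can be laid out as a short sketch. The test positions (2, 6, 9, 10, 12, and 15) and the 15-40 second range come from the text; the function names are illustrative, and drawing the intervals from a uniform distribution is only an assumption, since the report says no more than that the intervals were varied irregularly within that range.

```python
import random

# Positions of the CS-alone test trials within the series (taken from the text).
TEST_POSITIONS = (2, 6, 9, 10, 12, 15)


def build_acquisition_test_series(n_acquisition=10, test_positions=TEST_POSITIONS):
    """Return trial labels in order: 'test' (CS alone) or 'acquisition' (CS plus pain cues)."""
    n_total = n_acquisition + len(test_positions)
    return [
        "test" if position in test_positions else "acquisition"
        for position in range(1, n_total + 1)
    ]


def draw_intertrial_intervals(n_trials, low=15.0, high=40.0, seed=None):
    """Draw one intertrial interval (seconds) per trial.

    Uniform sampling is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(n_trials)]


series = build_acquisition_test_series()
intervals = draw_intertrial_intervals(len(series), seed=0)
```

With the defaults, the series holds 16 trials: 10 acquisition trials with the 6 test trials interspersed at the stated positions.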

The experiment was concluded by having the sub- 
jects complete a questionnaire in which they re- 
corded the somatic reactions that they experienced 
following the injection, checked on graphic rating 
scales both the severity of shocks administered to 
the model and the amount of discomfort produced 
by his pain reactions, and described their thoughts 
and response during the period when the model was 
subjected to the aversive stimulation. 


Vicarious Conditioning Scores 


The observers’ skin resistance was continuously 
recorded on the Grass Model 5 polygraph; GSR 
responses were defined as a change in the direction 
of lowered resistance of 2000 ohms or greater oc- 
curring within 5 seconds of the CS onset. The poly- 
graph records were scored independently by the ex- 
perimenter and a second judge who scored 25% of 
the records, drawn at random from each of the 
treatment conditions, without knowledge of the 
subjects’ group assignments. The scorers agreed on 
97% of the trials concerning the presence or ab- 
sence of a GSR response. 
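
The scoring rule and the agreement check can be sketched as follows. The 2000-ohm drop and the 5-second window come from the text. Taking the last reading at or before CS onset as the baseline is an assumption, since the report does not say how the reference level was chosen, and the function names and sample readings are hypothetical.

```python
def is_gsr_response(samples, cs_onset, threshold_ohms=2000.0, window_s=5.0):
    """Decide whether a trial shows a GSR response.

    samples: list of (time_s, resistance_ohms) pairs sorted by time.
    Baseline (an assumption): the last reading at or before CS onset.
    A response is a drop of at least threshold_ohms within window_s of onset.
    """
    before = [r for t, r in samples if t <= cs_onset]
    if not before:
        raise ValueError("no reading at or before CS onset")
    baseline = before[-1]
    window = [r for t, r in samples if cs_onset < t <= cs_onset + window_s]
    return any(baseline - r >= threshold_ohms for r in window)


def percent_agreement(scorer_a, scorer_b):
    """Percentage of trials on which two scorers made the same present/absent call."""
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("scorers must rate the same nonempty set of trials")
    matches = sum(a == b for a, b in zip(scorer_a, scorer_b))
    return 100.0 * matches / len(scorer_a)


# Hypothetical readings: resistance falls 2200 ohms within 3 seconds of onset.
trial = [(0.0, 50000.0), (1.0, 49500.0), (3.0, 47800.0), (6.0, 45000.0)]
responded = is_gsr_response(trial, cs_onset=0.0)
```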


RESULTS 


As an independent check on whether or not 
the injections had in fact produced differential 
degrees of sympathetic arousal, the subjects’ 
questionnaire data were scored for the fre- 
quency of different physiological reactions 
which are typically associated with epineph- 
rine (i.e., palpitation, tremor, flushing, and 
accelerated respiration). As shown in Table 
1, there were substantial differences both in 
the total number of subjects in each treat- 
ment condition who reported reactions indica- 
tive of high arousal, and in the total amount 
of reactivity. Analysis of variance of the lat- 
ter scores by means of the Kruskal-Wallis 
test reveals that the differential effects pro- 
duced by the experimental procedures are 
highly significant (H = 41, p < .001). Fur- 
ther intergroup comparisons based on the 
Mann-Whitney U test show that the epine- 
phrine .5 subjects experienced more physio- 
logical reactions than the epinephrine .2 group 
(p < .01), and both groups of subjects re- 
ceiving placebo injections (p < .001). The 
epinephrine .2 condition, in turn, produced 
higher scores than either the placebo (p < 
.001) or the shock-threat group (p < .01), 
which did not differ from each other. 

Fig. 1. Mean percentage conditioned GSR responses 
exhibited by subjects on each of three test periods 
for each of five treatment conditions representing 
differential levels of arousal. (Axis label: phases 
of the experiment.) 
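
For readers who wish to retrace the nonparametric comparisons reported above, the two statistics can be computed as in the following sketch. It uses the standard textbook definitions of the Kruskal-Wallis H (with the usual correction for ties) and the Mann-Whitney U; the function names and the small data sets in the usage note are hypothetical, and nothing here reproduces the study's raw scores, which are not given.

```python
from collections import Counter


def midranks(values):
    """Rank values from 1 to N, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H for a list of groups, corrected for ties."""
    pooled = [x for group in groups for x in group]
    n = len(pooled)
    ranks = midranks(pooled)
    total = 0.0
    offset = 0
    for group in groups:
        rank_sum = sum(ranks[offset:offset + len(group)])
        total += rank_sum ** 2 / len(group)
        offset += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
    tie_term = sum(count ** 3 - count for count in Counter(pooled).values())
    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0.0:
        raise ValueError("all observations are tied")
    return h / correction


def mann_whitney_u(a, b):
    """Mann-Whitney U (the smaller of the two U values) for two groups."""
    ranks = midranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2.0
    return min(u_a, len(a) * len(b) - u_a)
```

As a check on the arithmetic, three hypothetical groups [1, 2, 3], [4, 5, 6], and [7, 8, 9] have rank sums of 6, 15, and 24, so H = (12/90)(12 + 75 + 192) - 30 = 7.2. With five treatment conditions, H would be referred to the chi-square distribution with 4 degrees of freedom.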


Measurements of skin conductance level 
taken immediately prior to, and following 
completion of, the adaptation series are re- 
ported in Table 1. Analysis of variance of 
both sets of scores disclosed no significant 
differences. Nor did the groups of subjects 
differ in their estimates of the number and 
severity of shocks administered to the model, 
the degree of pain experienced by the per- 
forming model, and the amount of empathy 
that they felt for their suffering counterpart. 


Trials to Adaptation 


The number of trials required to neutralize 
the subjects’ responses to the CS varied closely 
around a mean value of 17.5. Analysis of 
variance of the adaptation data revealed no 
significant difference among treatment condi- 
tions. Thus, the arousal manipulations did not 
produce any systematic differential responsiv- 
ity to the CS prior to the conditioning proc- 
ess. 




Acquisition Series 


Figure 1 shows the percentage of the total 
number of conditioning trials in which sub- 
jects from the various groups exhibited GSR 
responses. Since the sets of scores for the ac- 
quisition, the test for conditioning, and the 
extinction series departed from normality, 
nonparametric techniques were employed in 
estimating the statistical significance of the 
obtained differences. 

As shown in Figure 1, subjects in all groups 
displayed a high frequency of GSR responsiv- 
ity to the stimulus complex containing both 
the CS and the model's pain cues. Analysis 
of variance performed on these data by means 
of the Kruskal-Wallis test revealed no sig- 
nificant group differences (Table 2) in this 
phase of the experiment. 


TABLE 2 


SIGNIFICANCE OF DIFFERENCES IN CONDITIONED RESPONSES BETWEEN AROUSAL CONDITIONS 


Comparison of treatment conditions (p values) 


Experi- ; 
mental rad P | Placebo | Placebo | PI. Pla Non- Non- Non- Shock | Shock | Epineph- 
m versus | versas | versu AE | teat | threat | threat | threat | threw | ‘ne 

inl throat | rine 2 | nines E — RC ERE ERE 3b x) 
Ton ee $60 Par Maer, ns <.025 ns ns n <02 st 
Extinction |14.73| <.01| <.01 ns «.05 ns ns ns i 
TABLE 3 

RANK CORRELATIONS BETWEEN EMOTIONAL PRONENESS AND VICARIOUS CONDITIONING 
SCORES FOR EACH OF THE FIVE TREATMENT CONDITIONS 

                                          Treatment conditions 
Conditioning scores     Nonthreat   Placebo   Shock-threat   Epinephrine .2   Epinephrine .5 

Test for acquisition       .18       -.23         .09             .51*            -.45* 
Extinction                 .07       -.10         .06             .50*            -.62** 

* p < .05. 
** p < .01. 


Test for Conditioning 

During the acquisition series, particularly 
in the initial trials, observers’ emotional re- 
sponses are most likely elicited directly by 
the model's pain reactions. Consequently, 
demonstration of vicarious conditioning ef- 
fects requires the occurrence of conditioned 
responses to the CS in the absence of the 
model's behavior. As can be seen from Figure 
1, the auditory stimulus itself had acquired 
differential aversive properties among the 
groups of observers subjected to varying de- 
grees of arousal. The overall differences yield 
a significance value beyond the .05 level 
(Table 2). Further comparisons of pairs of 
scores by the Mann-Whitney U test reveal a 
significantly higher rate of conditioned re- 
sponses among subjects in the shock-threat 
condition relative to both the low aroused no- 
injection group, and the high aroused epineph- 
rine-large dose group, which do not differ 
from each other. Observers in the placebo 
condition also displayed a higher level of con- 
ditioning than subjects in the .5 epinephrine 
group, although the latter difference is slightly 
below the .10 significance level. 


Extinction 


The differential vicarious conditioning 
noted in the test trials becomes even more 
pronounced in the extinction phase of the 
experiment (Table 2). Observers in both the 
placebo and the shock-threat groups continue 
to exhibit a significantly higher level of con- 
ditioned responses than either the no-injec- 
tion or the .5 epinephrine groups. Although 
the two sets of data yield essentially the same 
relationships between arousal and vicarious 
conditioning, the relative positions of the 
shock-threat and placebo treatment condi- 
tions are reversed so that the extinction com- 
parisons involving the latter group yield dif- 
ferences of larger magnitude, and even a 
differentiation at a borderline level of signifi- 
cance from the .2 epinephrine group. 


Emotional Predisposition and Vicarious 
Conditioning 


In order to determine the degree of rela- 
tionship between predisposition to emotional 
arousal and vicarious conditioning, the MA 
scale scores, which were comparable across 
conditions, were correlated separately within 
groups by the rank-order method with the 
measures of acquisition and extinction. The 
obtained correlation coefficients (Table 3), 
corrected for tied ranks, disclose no significant 
relationships between the two sets of variables 
in the no-injection, placebo, and shock-threat 
conditions, but moderately high positive co- 
variations for subjects receiving the .2 dose of 
epinephrine. By contrast, emotional predis- 
position was highly negatively correlated with 
vicarious conditioning under conditions of high 
physiological arousal. 
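
A rank-order correlation that allows for tied ranks can be sketched as follows. The sketch assigns tied scores the mean of the ranks they span and then computes the product-moment correlation on those ranks, which is one standard way of handling ties; whether this is the exact correction applied to the data in Table 3 is an assumption. The function names and the scores in the usage note are hypothetical.

```python
import math


def midranks(values):
    """Rank values from 1 to N, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_correlation(x, y):
    """Rank-order correlation computed as the product-moment r of the midranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two cases")
    rx, ry = midranks(x), midranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)
```

For example, hypothetical scale scores of 1, 2, 3, 4, 5 paired with conditioning scores of 5, 6, 7, 8, 7 give ranks of 1, 2, 3.5, 5, 3.5 on the second variable and a coefficient of 8 divided by the square root of 95, or about .82.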


DISCUSSION 


The present experiment provides further 
evidence that conditioned emotional responses 
can be transmitted vicariously. In addition, 
the overall findings reveal that observers’ emo- 
tional arousal is a significant determinant of 
vicarious conditioning. This is shown in the 
fact that frequency of conditioned responses 
is a positive function of the degree of psycho- 
logical stress. However, a monotonic decreas- 
ing function is obtained when, in addition to 
situational stress, subjects experience increas- 
ing physiologically induced arousal. 
If it can be assumed that the five treatment 
conditions represent increasing levels of emo- 
tional arousal on a single dimension, then the 
combined results suggest an inverted U rela- 
tionship between arousal level and vicarious 
conditioning. There are two sets of data that 
lend some support to this interpretation. It 
will be recalled that the shock-threat condi- 
tion yielded the most significant differences 
in acquisition scores, but the placebo condi- 
tion emerged superior in the extinction phase 
of the experiment. This reversal is probably 
due to the fact that the threat of impending 
shock stimulation produced a further height- 
ening of emotional arousal in shock-threat 
observers as they entered the extinction series 
of trials. Results based on the within-treat- 
ments correlational analyses disclose that 
emotionality and vicarious conditioning are 
essentially unrelated at low and moderate 
levels of arousal, positively correlated as 
arousal is further increased, and highly in- 
versely related under conditions of strong 
physiological arousal, suggestive of a non- 
monotonic relationship. 

The failure of the low-aroused subjects to 
exhibit much vicarious conditioning is readily 
explainable in terms of an activation hypothe- 
sis, but the equally poor conditioning in sub- 
jects administered the large dose of epineph- 
rine may suggest alternative interpretations. 
One possible explanation is that epinephrine 
in the high dosage range has an inhibitory 
effect on the GSR response itself. This in- 
terpretation, however, cannot account for the 
differential conditioning rates, since no sig- 
nificant differences were obtained among 
groups in both the total number of trials to 
adaptation, and the frequency of GSR re- 
sponsivity during acquisition when the stimu- 
lus complex contained both the CS and the 
model’s pain cues. Moreover, the fact that 
test trials were interspersed with acquisition 
trials, and the entire conditioning series was 
completed in a relatively brief period of time, 
rules out the possibility of any significant 
temporally related changes in drug action. 
Finally, if epinephrine had a suppressive ef- 
fect on the GSR response, this outcome 
would have precluded high correlations be- 
tween emotional responsivity and vicarious 
conditioning. 


Although the overall findings provide evi- 
dence of a relationship between arousal level 
and vicarious conditioning, the manner in 
which arousal produces facilitative or disrup- 
tive effects remains to be demonstrated. Sub- 
jects’ replies to the postexperimental question- 
naire suggest that disruptive effects may, in 
part, be mediated by self-generated compet- 
ing responses designed to reduce the aversive- 
ness of the vicarious instigation situation. In 
some cases this took the form of an intensive 
focus on irrelevant external stimuli, to the 
exclusion of the disturbing pain cues (“When 
I noticed how painful the shock was to him I 
concentrated my vision on a spot which did 
not allow me to focus directly on either his 
face or hands.”). Other subjects engaged in 
an extended series of avoidant responses in an 
effort to find one that would be effective in 
reducing their discomfort (“The first 3 or 4 
shocks, I thought about the amount of pain 
for the other guy. Then I began to think to 
minimize my own discomfort. I recall looking 
at my watch, looking out the window, and 
checking things about the room. I recall that 
the victim received a shock when I was think- 
ing about the seminar, and that I didn't seem 
to notice the discomfort as much in this in- 
stance.”). Like the latter subject, most ob- 
servers attempted to decrease the aversive 
stimulation arising from the model’s pain 
reactions by conjuring up competing cogni- 
tive responses (“I tried to think of other 
topics; general elections in Britain, will Wil- 
son become a Prime Minister, academic prob- 
lems, planned trip to New York. I was not 
able to keep thinking on any topic too con- 
sistently and my thoughts rather broke down 
after a while. . . .” “I tried to be cool. I 
thought about Latin verbs and about Latin 
compositions.”). A few subjects, however, 
marshaled considerably more potent contra- 
vening cognitive responses (“I finally just 
tried to think about the girl I slept with last 
night. It kept my mind off those damn 
shocks.”). To the extent that an observer who 
is confronted by a vicarious instigation situ- 
ation succeeds either in attenuating distress- 
producing arousal by performing competing 
responses, or in curtailing attentional re- 
sponses to the relevant discriminative stimuli, 
the CS is likely to become endowed with rela- 
tively weak aversive properties. 

As a partial check on the competing-re- 
sponse hypothesis the questionnaire data were 
scored for the number of subjects in each of 
the treatment conditions who reported delib- 
erately engaging in various avoidant and 
stimulus neutralization stratagems. These 
types of responses occurred most frequently 
among subjects in the epinephrine .5 condi- 
tion; not a single observer in the shock-threat 
group noted responding in this manner, while 
each of the remaining groups contributed a 
few subjects. The obtained group differences, 
although significant (χ² = 8.50, p < .05), 
should be accepted with reservation consider- 
ing the limited number of cases involved, and 
the fact that the expected frequencies in some 
of the cells were relatively small. 
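
The reservation about small expected frequencies can be made concrete with a short sketch of the chi-square computation for a groups-by-response table. The function name and the counts in the usage note are hypothetical; the per-group counts from the questionnaire are not reported and are not reconstructed here.

```python
def chi_square_independence(table):
    """Chi-square test of independence for a table given as a list of rows.

    Returns (chi_square, degrees_of_freedom, smallest_expected_frequency).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain at least one case")
    chi_square = 0.0
    smallest_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            smallest_expected = min(smallest_expected, expected)
            chi_square += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df, smallest_expected
```

For a hypothetical table with rows [10, 20] and [20, 10], every expected frequency is 15 and the statistic is 4 x 25/15, or about 6.67, on 1 degree of freedom. The third returned value makes it easy to see when a table falls below the conventional rule of thumb of an expected frequency of about 5 per cell.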

While the above supplementary findings 
have suggestive value, and are consistent with 
data from studies of direct instrumental 
learning demonstrating that high emotional 
arousal reduces cue utilization (Easterbrook, 
1959; Kausler & Trapp, 1960), the degree to 
which response-competing processes can dis- 
rupt vicarious conditioning must be estab- 
lished empirically through systematic manip- 
ulation of appropriate mediating responses. 

It should be noted in passing that, unlike 
direct classical conditioning in which the sub- 
ject is unable to modify the intensity of 
aversive stimulation administered to him, in 
vicarious conditioning situations one can 
readily engage in response-interference strata- 
gems designed to attenuate vicariously insti- 
gated affective reactions. For this reason, in- 
vestigations of direct and vicarious classical 
conditioning may not always yield equivalent 
relationships between variables. Similarly, 
findings based on vicarious classical condi- 
tioning may not be applicable to modeling 
processes involving instrumental classes of 
responses. Thus, as demonstrated in Schach- 
ter’s experiments, a person experiencing high- 
intensity autonomic responses may welcome 
the opportunity to engage in matching social 
behavior, whereas in a classical conditioning 
situation permitting no motoric responses 
high-aroused subjects can resort only to stim- 
ulus neutralization tactics as a means of re- 
ducing their discomfort. 


The questionnaire data reveal additional 
complexities in the vicarious conditioning 
process that require systematic investigation. 
It was assumed that vicarious instigation of 
emotional responses is mediated by a process 
of stimulus generalization. That is, stimuli 
impinging upon a given person and the at- 
tendant reactions will arouse in the observer 
analogous emotional responses, the magnitude 
of the responses being a function of the de- 
gree of similarity between the participants. 
One would expect persons who possess similar 
characteristics to share many experiences in 
common. Results of experiments with infra- 
human subjects furthermore reveal that the 
experience of repeated paired consequences is 
an important determinant of vicarious arousal. 
Church (1959), for example, found that rats 
subjected to paired aversive consequences 
subsequently exhibited greater emotional re- 
sponsivity to the pain cues emitted by another 
rat than a group of animals that had received 
the same amount of aversive stimulation, but 
unassociated with the pain responses of an- 
other member of their species. Moreover, em- 
ploying an interanimal avoidance conditioning 
procedure, Murphy, Miller, and Mirsky 
(1955) demonstrated that emotional responses 
in monkeys could be vicariously elicited not 
only by the sight of their experimental coun- 
terpart, but also through stimulus generaliza- 
tion by another monkey that was never in- 
volved in the original aversive contingencies. 

In the present experiment, the self-report 
data indicated that with a few notable excep- 
tions subjects did, in fact, experience strong 
empathetic emotional reactions. Several of the 
observers, however, derived considerable sat- 
isfaction from witnessing pain being inflicted 
on the model (e.g., “My main reaction was 
sadistic. My main thoughts were, ‘Oh boy, is 
he getting it... .’” “I was rather embar- 
rassed to see that I was grinning when my 
partner got shocked and dropped the stylus 
with a suppressed groan. . . .” “I, at times, 
sadistically wanted him to get shocked."). 
The total number of cases is too small for 
comparative analysis of vicariously condi- 
tioned responses. It is planned, however, to 
study the level of vicarious conditioning as a 
function of paired aversive consequences, 
paired opposing consequences, and unassoci- 
ated negative outcomes experienced by the 
model and observing subjects. 
Ss were 199 male and female college freshmen who completed the Activities 
Index (AI) and the semantic differential (SD). Ss’ needs for dependency, emo- 
tional expressiveness, and intellectual orientation on AI were compared with 
their projections of potency and activity on parents. SD data were factor 
analyzed for scores on “My Mother” and “My Father.” Ss were also ranked 
(AI) according to the strength of their needs. Comparisons were made between 
different groupings of the projection and need scores. Results indicated that 
Ss’ perceptions of strength in both parents separated dependent from non- 
dependent Ss better than perceived trait in either parent alone. Father was 
more influential with respect to male needs than mother was to female needs. 
The factor of activity in mothers related to greater male expressiveness and 
higher intellectual orientation. Passivity in fathers related to higher female 
dependency and greater intellectual orientation. 


There has been considerable study of the 
process of identification in children and ado- 
lescents. Growing out of the empirical re- 
search and complementing it have been an 
increasing number of consistent theoretical 
speculations regarding the nature of the 
identification process. Identification theory as 
formulated by Freud (1959) centered around 
the concepts of renunciation, regression, in- 
trojection, and superego formation. According 
to the theory, renunciation of the cross-sex 
parent and regression to the oral stage for 
purposes of introjecting the like-sex parent 
occurred in the interests of superego forma- 
tion and adequate psychosexual development. 
Sanford (1955) has noted that since Freud's 
original constructs were set down, the concept 
of identification has taken on a variety of 
meanings dependent upon the context in which 
it is found. In his evaluation of the usefulness 
of the term, Sanford redefined its boundaries 
and suggested that the term may have limited 
value as an explanatory concept for much of 
normal personality development. 
One implication of the kind of formulation 
presented by Freud is that the adequacy of 
the subsequent adjustment is a function 
largely of the like-sex parent. In this respect, 
the increasing diffusion of the meaning of 
identification has been helpful in that it has 
led to a broadening of the base of re- 
search about identification theory with some 
stimulating reformulations about the familial 
antecedents to personality development. 

Recent attempts to abstract a concept of 
identification from empirical data have in- 
cluded those of Kagan (1958), McCandless 
(1961, pp. 338-353), and Slater (1961). 
Utilizing the findings of such research as that 
of Carlson (1963), Gray (1959), and Lazo- 
wick (1955), the contention is generally ad- 
vanced that the antecedents of identification 
reside in the milieu of the familial interaction 
and that the process of identification is not 
an either-or proposition but can vary along 
such dimensions as strength of identification 
and parental influence. Cross-identification is 
accepted as a culturally determined phe- 
nomenon of the normal developmental process 
although the strength of the like-sex identi- 
fication has been found to be more essen- 
tial to adequate male rather than female 
adjustment. 


PROBLEM 


The central propositions about personal- 
ity development suggested by identification 
theory are that the consequent personality 
structure of the person is more a function of 
both parents in interaction than of the inde- 
pendent contribution of either parent, and 
that a supportive familial milieu is most 
productive of adequate adjustment. In this 
study, these propositions offered a starting 
point for studying the relationship between 
the personality structures of a group of sub- 
jects and the traits that they projected onto 
their parents. 

Specifically, the purposes of this study were 
to investigate in an exploratory fashion the 
relationships that obtained between the 
strength and activity that subjects projected 
onto their parents and their needs for de- 
pendency, their emotional expressiveness, and 
their intellectual orientation. In accordance 
with identification theory, one would expect 
that the need structures of subjects would 
represent a mixed contribution from both 
parents and that the subjects with difficulty 
in the expression of needs would be those 
subjects whose parents had failed to provide 
a supporting relationship. 


METHOD 
Subjects and Instruments 


The subjects for this study consisted of 90 male 
and 109 female University freshmen students who 
were selected in a random way during registration 
time from among all incoming freshmen students. 
All of the subjects were tested together in a single 
group setting during a 2-hour block of time. The 
subjects first described themselves on Stern’s Activi- 
ties Index (AI) Form 1158 (Stern, 1963) which was 
administered according to the usual directions 
and then they completed a form of the semantic 
differential (SD) that was prepared for this study. 


Activities Index (AI) 


The AI is a personality inventory based on the 
mergence of a theory of need structure with that of 
personality patterns (Stern, 1958). The AI consists 
of 300 items, 10 each of which are subsumed under 30 
independent needs. From these 30 subtests, 12 first- 
order personality factors can be obtained which may 
also be described as 3 second-order factors. 

Since these second-order factors represent the most 
reliable index of a subject’s need structure, these 
factor scores were used in testing the study hypothe- 
ses. There are three such second-order factors: intel- 
lectual orientation, dependency needs, and emotional 
expression. The subtests and the weights that consti- 
tute these factors are described in the manual. Emo- 
tional expression, for example, consists of five first- 
order factors: supplication, sensuousness, friendliness, 
expressiveness-constraint, egoism-diffidence, and self- 
assertion. 

For purposes of the analyses carried out in this 
study, each subject’s scores on the three second- 
order factors were ranked so that the highest 
rank was always assigned to the subject who was 
most emotionally expressive, most dependent, or most 
intellectually oriented. 


Semantic Differential (SD) 


The semantic differential used in the study was 
prepared for use according to the procedures sug- 
gested by Osgood, Suci, and Tannenbaum (1957). 
Sixteen scales and 21 concepts were prepared in the 
form of a differential. The ordering of concepts, 
scales, and the polarity of the adjectival pairs were 
left to a random process. 

Scales were selected for study on the basis of their 
promise from previous research that a factor analy- 
sis of the scales would yield judgments along the 
three dimensions of potency, activity, and evaluation. 
The 21 concepts were selected so that judgments 
could be made in a number of areas about intra- 
psychic states and interpersonal relationships. For 
purposes of this study, three of the concepts were 
selected for analysis with regard to the subject's 
need structure. 


Preliminary Factor Analyses of the SD 


A series of exploratory factor analyses were under- 
taken with regard to the SD prior to the analysis 
of the particular concepts to be reported in this 
study. These analyses were conducted in order to 
drop uninterpretable scales and to determine whether 
the factorial structure of the scales used in this 
study replicated the dimensions reported by Osgood 
et al. (1957). All the factor analyses reported in 
this study were computed on the Michigan State 
University CDC 3600 (DeJonge & Sim, 1964). In all 
cases, a principal factor solution was obtained and 
factors were rotated, using a quartimax rotation 
method, until the Kiel-Wrigley (1960) criterion that 
three scales have their highest loading on a factor 
was no longer met (Harman, 1960). 

The first of these analyses consisted of factoring 
the 16 study scales with both persons and concepts 
used as replications. The results of this analysis 
indicated that 4 of the scales could be dropped from 
study since they did no more than saturate the 
evaluation dimension of the subjects’ judgments. 
The remaining 12 scales were then refactored as 
before for male and female subjects, again using 
persons and concepts as replications. A three-factor 
solution was again achieved and the composition of 
the scales was noted to be in general accord with 
the findings of previous research. 

For both male and female subjects, eight scales un- 
equivocally defined an evaluation, activity, and po- 
tency factor. Since these scales represented the fac- 
tors independent of particular concepts, their meaning 
was used to define the factors in later analyses of 
specific concepts. The scales worthless-valuable, dirty- 
clean, helpful-obstructive, and cruel-kind clearly de- 
fined an evaluation dimension. The scales moving- 
still and excitable-calm defined the activity dimension, 
and humble-proud and tenacious-yielding defined the 
potency dimension. The meaning of the remaining 
four scales, rash-cautious, hard-soft, complex-simple, 
and light-heavy, shifted dependent upon the 
sex of the respondent or the concept in question. 


Factor Analysis of Study Concepts 


Having established the general meaning of the 
scales employed by the SD in this study, the con- 
cepts “My Father" and “My Mother” were factor 
analyzed for male and female subjects. In every 
case a three-factor solution was achieved in which 
there was a clearly defined evaluation, activity, and 
potency factor. The obtained factor solutions were 
used to develop factor scores for each subject. Since, 
in this study, the subjects’ needs were to be com- 
pared with their projection of activity and potency 
onto their parents, only those two of the three 
factors were separated out for examination. 

Factor scores were obtained by selecting those 
scales of each solution which represented the activ- 
ity and potency dimensions and weighting the scales 
in accordance with their contribution to the factor 
solution according to the formula given by Thomson 
(1951). These scale weightings were then used as 
multipliers and were applied to each subject's stand- 
ardized scores on the semantic differential. This 
procedure succeeded in assigning to each subject a 
"potency" factor score and an "activity" factor score 
in z score form. To test whether the selected factor 
scores reconstituted the inflependent dimensions sug- 
gested by the orthogonal solution, the subjects' factor 
scores were intercorfelated. For males and females, 
respectively, the intercorrelation between the sub- 
ject's activity and potency factor scores on the two 
concepts in question were as follows: Father (r — .20, 
7 — 20), Mother (r —.14, r= 03). 
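
The scoring step described above can be sketched in Python. This is only an illustration under stated assumptions, not the author's computation: the ratings and scale weights below are invented, and the function names `standardize` and `factor_scores` are hypothetical. The sketch standardizes each scale across subjects, forms a weighted sum, and standardizes the sum again so that the factor score is in z score form.

```python
from statistics import mean, pstdev

def standardize(values):
    """Convert raw values to z scores (population standard deviation)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def factor_scores(ratings, weights):
    """ratings: {scale: [one rating per subject]}; weights: {scale: weight}.
    Returns one z-scored factor score per subject."""
    z = {scale: standardize(vals) for scale, vals in ratings.items()}
    n_subjects = len(next(iter(ratings.values())))
    composite = [sum(weights[s] * z[s][i] for s in weights)
                 for i in range(n_subjects)]
    return standardize(composite)

# Hypothetical data: four subjects rated on the two potency scales.
ratings = {
    "humble-proud": [2, 5, 6, 3],
    "tenacious-yielding": [6, 3, 2, 5],
}
# Hypothetical weights; the negative sign reverses a scale keyed in the
# opposite direction.
weights = {"humble-proud": 0.8, "tenacious-yielding": -0.7}
potency = factor_scores(ratings, weights)
```

Because the composite is standardized again at the end, the resulting scores have a mean of zero and unit variance within the sample, whatever the absolute size of the weights.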

The distributions of factor scores on each concept
for the two factors in question were placed in rank-order
form for males and females separately where
the subject with the highest score on each factor
was always given the highest rank for that factor.
These ranks were dichotomized at the median. In
that way, each subject was considered to have projected
either activity or passivity and potency or
impotency onto both parents.
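
A minimal sketch of the ranking and median split, assuming no tied scores (the article does not say how ties were handled). The scores are invented and `median_split` is a hypothetical helper.

```python
def median_split(scores, high_label, low_label):
    """Rank subjects (highest score gets the highest rank) and
    dichotomize the ranks at the median."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0] * len(scores)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    cutoff = len(scores) / 2
    return [high_label if r > cutoff else low_label for r in ranks]

# Hypothetical activity factor scores (z scores) for six subjects.
activity_scores = [0.9, -1.2, 0.1, -0.3, 1.4, -0.9]
labels = median_split(activity_scores, "active", "passive")
```

The same helper would be applied separately to the potency scores, and separately for each parent and each sex, to obtain the potent/impotent labels.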


Grouping the Projection Data 


The subjects’ projection scores were grouped in 
several ways. The first groupings consisted of classifying
the subjects according to how they attributed
each trait (activity or potency) to both parents. In 
this analysis, a subject's projection of activity onto 
one parent was compared, for example, with his 
projection of the same trait to the other parent and 
he was placed in one of four groups as a result of 
this comparison; that is, the father and mother 
could both be seen as active, or as passive, or one 
parent could be seen as active and the other passive. 

The second set of comparisons consisted of group- 
ing the subjects according to how they projected 
both traits onto each parent. In this grouping, for 
example, a subject's projection of activity onto one 
of his parents was combined with his projection of 
potency onto the same parent. These groupings again
yielded four combinations. A subject could see a 
parent as potent and active, as potent and passive, 
as impotent and active, or as impotent and passive. 
The analogue between this type of procedure and 
analysis of variance designs has been noted by Gray 
(1959). 
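
The two kinds of grouping can be illustrated with a short sketch. The labels below are invented for five subjects, and `cross_classify` is a hypothetical helper; each call yields the four possible category counts for one grouping.

```python
from collections import Counter

def cross_classify(first, second):
    """Count subjects in each combination of two dichotomized labels."""
    return Counter(zip(first, second))

# Hypothetical dichotomized projections for five subjects.
father_activity = ["active", "active", "passive", "passive", "active"]
mother_activity = ["active", "passive", "active", "passive", "passive"]
father_potency = ["potent", "impotent", "potent", "impotent", "potent"]

# First grouping: the same trait projected onto both parents.
same_trait = cross_classify(father_activity, mother_activity)
# Second grouping: both traits projected onto the same parent.
same_parent = cross_classify(father_activity, father_potency)
```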


Testing for Response Set 


Response set has been found to be an important 
variable to control in research studies where hy- 
potheses are generated about the attribution of traits. 
It was decided, therefore, to examine the possibility 
that the subjects in this study had a set toward 
viewing “strength” and “activity” as invariant 
human characteristics. If the subjects had such a 
set, interpretation of the data regarding any psycho- 
logical association between the subjects’ need struc- 
tures and their regard for their parents’ strength and 
activity would be inconclusive. 

To test for response set, the subjects’ self- 
descriptions on the SD were factor analyzed in the 
same way as the two study concepts described above. 
These factor solutions were then examined to deter- 
mine whether the rotated factor structures were 
similar to the composition of the factor solutions 
obtained from the subjects’ descriptions of their 
parents.

In the case of the male subjects, the best solution 
consisted of a two-factor solution, the first factor 
being an evaluation-activity factor and the second 
factor an activity-potency factor. The male subjects 
clearly differentiated their self-descriptions from their 
descriptions of their parents. In other words, the 
three-factor solution obtained from the male sub- 
jects when asked to describe their parents indicated 
that the subjects could conceive of their parents as 
being strong without necessarily being active, but 
they considered their own strength and activity as 
being bound together. 

Although the males made this differentiation in 
regard to these concepts, the females did not. A 
comparison of the factor analyses of their self- 
description and their description of their parents 
indicated that the factor structures were similar. For 
female subjects, a further analysis of their responses 
was considered necessary to determine whether re- 
sponse set could account more parsimoniously for 
any study findings. 

One way of resolving this question was to deter- 
mine empirically whether the female subjects who 
saw one parent as active or strong would attribute 
the same characteristic to the other parent. This
analysis was effected by testing the distribution of 
the traits as chi-square distributions in which the 
subject’s projection of the trait onto one parent was 
compared with his attribution of the same trait to 
the other parent. The results of these analyses indi- 
cated that the attribution of a trait to one parent 
could not be predicted from the attribution of the 
trait to the other parent; so response set as an 
interpretation of the findings was discounted. 
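
As a rough illustration of this check, the sketch below computes a Pearson chi-square for a 2 x 2 table built from the category Ns printed in Table 1 for the female subjects' potency projections. The article does not report the statistic itself or say whether a continuity correction was applied, so the uncorrected formula and the .05 criterion used here are assumptions.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction)
    for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Rows: father potent / father impotent.
# Columns: mother potent / mother impotent.
chi2 = chi_square_2x2(27, 27, 26, 29)

CRITICAL_05 = 3.841  # chi-square critical value for 1 df at alpha = .05
no_association = chi2 < CRITICAL_05
```

With these counts the statistic is far below the critical value, which is in line with the conclusion that attribution to one parent did not predict attribution to the other.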

RESULTS


Need for Dependency versus Projection of 
Strength and Activity 


The test of the study propositions consisted 
of comparing the subjects’ projection scores 
with their rankings on the three second-order 
factors of the AI. The results reported in 
Table 1 are those in which comparisons be- 
tween the subjects’ dependency scores were 
related to the “potency” and “activity” that 
they attributed to their parents. 

A word needs to be said about the structure
of Table 1. Along the left-hand side of
the table are the two major groupings within
which the comparisons are to be made: the
subject’s scores for the projection of the same
trait onto both his parents (between parents
by subject) and the subject’s scores for the
projection of both traits onto the same parent
(within parent by subject). These comparisons
are then subdivided, respectively, into the
particular trait and parent in question. It can
be seen further that there are four categories
within each of these subgroupings, each category
being followed by the number of subjects
within the sample that fell into that
particular category.

Table 1 also presents the data that were used
in making decisions about how to combine the
ranks for making necessary comparisons. In


TABLE 1


COMPARISON OF RANKINGS OF SUBJECTS’ DEPENDENCY SCORES WITH THEIR SCORES FOR PROJECTION OF
“STRENGTH” AND “ACTIVITY” ONTO PARENTS


Rankings of dependency scores

                                         |               Males               |              Females
Projection scores                        |  N | Obtained | Expected |   z    |   N | Obtained | Expected |   z
Between parents by subject
  Potency-impotency
    Father potency and mother potency     | 25 |   905.5ᵃ |  1146.6  |        |  27 |  1371.5  |  1485.0  |
    Father potency and mother impotency   | 20 |   951.0  |   900.9  |        |  27 |  1490.5  |  1485.0  |
    Father impotency and mother potency   | 20 |   951.5  |   900.9  |        |  26 |  1289.0  |  1430.0  |
    Father impotency and mother impotency | 25 |  1287.0  |  1146.6  |        |  29 |  1844.0ᵃ |  1595.0  |
    Total                                 | 90 |  4095.0  |  4095.0  |  2.09  | 109 |  5995.0  |  5995.0  | -1.71
  Activity-passivity
    Father activity and mother activity   | 22 |  1084.0ᵃ |   982.8  |        |  27 |  1349.5  |  1485.0  |
    Father activity and mother passivity  | 23 |  1172.5ᵃ |  1064.7  |        |  26 |  1251.0  |  1430.0  |
    Father passivity and mother activity  | 23 |   922.0  |  1064.7  |        |  29 |  1690.0ᵃ |  1595.0  |
    Father passivity and mother passivity | 22 |   916.5  |   982.8  |        |  27 |  1704.5ᵃ |  1485.0  |
    Total                                 | 90 |  4095.0  |  4095.0  | -1.69  | 109 |  5995.0  |  5995.0  |  1.91
Within parents by subject
  Father
    Father activity and potency           | 27 |  1280.5  |  1228.5  |        |  33 |  1656.5  |  1815.0  |
    Father activity and impotency         | 18 |   976.0  |   819.0  |        |  20 |   944.0  |  1100.0  |
    Father passivity and potency          | 18 |   576.0ᵃ |   819.0  |        |  21 |  1205.5ᵃ |          |
    Father passivity and impotency        | 27 |  1262.5  |  1228.5  |        |  35 |  2189.0ᵃ |          |
    Total                                 | 90 |  4095.0  |  4095.0  |  2.45  | 109 |  5995.0  |  5995.0  |  1.91
  Mother
    Mother activity and potency           | 26 |  1101.0  |  1187.6  |        |  27 |          |          |
    Mother activity and impotency         | 19 |   905.0ᵃ |   859.9  |        |  27 |  1650.0  |  1485.0  |
    Mother passivity and potency          | 19 |   756.0  |   859.9  |        |  26 |  1328.0  |  1430.0  |
    Mother passivity and impotency        | 26 |  1333.0ᵃ |  1187.6  |        |  29 |  1675.0  |  1595.0  |
    Total                                 | 90 |  4095.0  |  4095.0  | -1.54  | 109 |  5995.0  |  5995.0  |  1.54

ᵃ Indicates categories that were combined within a set for Mann-Whitney U test.

order to determine empirically how to combine
categories so as to maximize the significance
of the differences between categories,
the theoretical sum of ranks for each of the
categories within a sample of subjects was
computed and compared with the actual sum
of ranks obtained. For example, it can be
noted that the sum of ranks for a sample of
90 male subjects is equal to 4095.0. Since
25 subjects attributed potency to both parents,
the “expected” proportion of rankings
within that category would be R = 1146.6.
As it happens, however, the “obtained” sum
of rankings for the sample was R = 905.5.
Since this sum falls considerably below the
“expected” sum of ranks, and since all other
categories are above the expected sum, this
category was tested, using the Mann-Whitney
U test (Siegel, 1956), against all others.

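
The arithmetic in this example can be reproduced with a short sketch. The helper names are hypothetical. The `round_share` option reflects an inference from the printed figures rather than anything the article states: the value 1146.6 is obtained when the proportion 25/90 is first rounded to two decimals (.28), whereas the unrounded proportion gives 1137.5.

```python
def total_rank_sum(n):
    """Sum of the ranks 1..n."""
    return n * (n + 1) / 2

def expected_rank_sum(n_category, n_total, round_share=None):
    """Share of the total rank sum expected for a category.
    round_share: optional number of decimals for the proportion."""
    share = n_category / n_total
    if round_share is not None:
        share = round(share, round_share)
    return share * total_rank_sum(n_total)

total = total_rank_sum(90)                           # all 90 male subjects
exact = expected_rank_sum(25, 90)                    # unrounded proportion
rounded = expected_rank_sum(25, 90, round_share=2)   # proportion taken as .28
```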
Looking first at the projection of potency 
by male subjects, it can be seen that the 
males who received the lowest dependency 
scores perceived both of their parents as being 
strong. For a sample of 25, a sum of ranks, 
R = 905.5, yields a z score of 2.09 which is 
significant by a two-tailed test at p < .04.
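
The reported z can be checked with the usual large-sample normal approximation for a rank sum. The sketch below assumes no correction for ties or continuity, which the article does not specify, and the function names are hypothetical. The computed z is negative because the obtained rank sum lies below its expectation; the article reports the value as 2.09, so its sign convention is presumably different.

```python
from math import erf, sqrt

def rank_sum_z(rank_sum, n1, n_total):
    """Normal approximation for the rank sum of a group of n1 subjects
    drawn from n_total ranked subjects (no tie correction)."""
    n2 = n_total - n1
    expected = n1 * (n_total + 1) / 2
    sd = sqrt(n1 * n2 * (n_total + 1) / 12)
    return (rank_sum - expected) / sd

def two_tailed_p(z):
    """Two-tailed probability under the standard normal distribution."""
    return 1 - erf(abs(z) / sqrt(2))

# Males who attributed potency to both parents: N = 25 of 90, R = 905.5.
z = rank_sum_z(905.5, 25, 90)
p = two_tailed_p(z)
```

The same function applied to the 18 males who saw their fathers as strong but passive (R = 576.0) gives a magnitude close to the 2.45 reported for that comparison.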
Turning to the female subjects’ response 
patterns, we find a somewhat similar phe- 
nomenon operating. The greatest differentia- 
tion along the dependency dimension occurred 
when both the father and mother were seen 
as impotent. Impotence in both parents was 
associated with female subjects who received 
the highest need for dependency scores. For 
N = 29, a rank sum of 1844 yields a z of
-1.71 which is significant at p < .09.

It should be noted that in the case of both
male and female subjects, the dependency
variable was associated with strength that
was perceived to reside in both parents rather
than in one or the other. This fact is more
clearly illustrated in the data for male sub- 
jects than in the female subjects’ data. For 
the female subjects, the mother's contribution 
of strength seemed to outweigh that of the 
father as will be pointed out later. 

With regard to the projection of activity 
onto both parents by male subjects, the 
greatest differentiation was achieved by com- 
bining the categories of subjects who at- 
tributed activity versus passivity to the 
father rather than categories involving both
parents or the mother. That is, the dependency
variable was here related to an attribute
uniquely perceived in the father. Further, it
can be seen that subjects who perceived
their fathers as active rather than passive had
a greater need for dependency (z = -1.69,
p < .09).

For female subjects, it was again the father 
rather than the mother or both parents that 
contributed to the higher female subjects’ 
dependency needs. It is interesting to note, 
however, that it was the female subjects who 
perceived their fathers as passive who re- 
ceived the higher scores on dependency 
(z = 1.91, p < .06). This finding is the con- 
verse of the one reported immediately above 
for male subjects.

The next question to be considered in 
Table 1 had to do with the relationship be- 
tween subjects’ dependency needs and the 
interaction of strength and activity within 
either parent. For the male subjects,
differentiation among categories occurred
when the father was
seen as strong but passive. For N = 18, a
sum of ranks, R = 576.0, yields a z score of
2.45 which is significant at p < .01. Further,
since the sum of ranks is low, it indicates that
the subjects who perceived their fathers as
strong and passive had fewer dependency
needs. This analysis confirmed the fact that
it was the interaction of the traits in the 
father rather than either strength or activity 
that contributed to the dependency needs in 
the male subjects. 

Regarding the male subjects’ perceptions
of their mothers, it was the strength of the
mother rather than any interaction of strength
and activity that contributed to the male subjects’
dependency. By combining the categories
involving strength perceived in their
mothers, the categories are differentiated at
p < .12. Further, it can be seen that the
higher ranks were attributed to the mothers 
who were seen as impotent. It can be con- 
cluded from this that the mother's strength 
was more important than her activity and it 
was in perceived maternal weakness that the 
greatest need for dependency occurred. 

Turning to the female subjects, it can be 
noted that the father's contribution to the 
girl's dependency was a function of his activ- 
TABLE 2


COMPARISON OF RANKINGS OF SUBJECTS’ EMOTIONAL EXPRESSIVENESS SCORES WITH THEIR SCORES FOR
PROJECTION OF “STRENGTH” AND “ACTIVITY” ONTO PARENTS


Rankings of emotional expressiveness scores

                                         |           Males
Projection scores                        |  N | Obtained | Expected
Between parents by subject
  Potency-impotency
    Father potency and mother potency     | 25 |  1205.0  |  1146.6
    Father potency and mother impotency   | 20 |   894.5  |   900.9
    Father impotency and mother potency   | 20 |   887.0  |   900.9
    Father impotency and mother impotency | 25 |  1108.5  |  1146.6
    Total                                 | 90 |  4095.0  |  4095.0
  Activity-passivity
    Father activity and mother activity   | 22 |  1047.0ᵃ |   982.8
    Father activity and mother passivity  | 23 |   837.0  |  1064.7
    Father passivity and mother activity  | 23 |  1268.5ᵃ |  1064.7
    Father passivity and mother passivity | 22 |   942.5  |   982.8
    Total                                 | 90 |  4095.0  |  4095.0
Within parents by subject
  Father
    Father activity and potency           | 27 |  1080.0  |  1228.5
    Father activity and impotency         | 18 |   804.0  |   819.0
    Father passivity and potency          | 18 |  1019.5ᵃ |   819.0
    Father passivity and impotency        | 27 |  1191.5  |  1228.5
    Total                                 | 90 |  4095.0  |  4095.0
  Mother
    Mother activity and potency           | 26 |  1319.5ᵃ |  1187.6
    Mother activity and impotency         | 19 |   996.0ᵃ |   859.9
    Mother passivity and potency          | 19 |   712.5  |   859.9
    Mother passivity and impotency        | 26 |  1007.0  |  1187.6
    Total                                 | 90 |  4095.0  |  4095.0

ᵃ Indicates categories that were combined within a set for Mann-Whitney U test.

ity rather than potency. Since there was no
interaction of traits, this finding simply replicates
the earlier one that female subjects were
more likely to have higher dependency needs
if they saw their fathers as passive. With
regard to their mothers, it is the mother’s
strength rather than activity that the female
draws on. The more impotent the mother is
perceived to be, the greater is the female subject’s
dependency (z = 1.54, p < .12).


Emotional Expressiveness versus Projection 
Scores 

The findings regarding the relationship between
the subjects’ emotional expressiveness
scores and their projection of strength and
activity are not as extensive as those for the
dependency scores. For female subjects, in
fact, the subjects’ emotional expressiveness
was not found to be related to the attribution
of either trait nor was there any interaction
effect within either parent.

The data for male subjects are reported in
Table 2. It can be seen that the activity of
the mother is correlated with male subjects’
emotional expressiveness. The more active the
mother was perceived to be, the more emotionally
expressive were the subjects (z = -2.16,
p < .03). It can also be inferred from Table 2
that the mother’s activity took precedence
over the father’s and it was the mother’s
activity rather than her own strength that
contributed to the subjects’ expressiveness.

By itself, the perceived strength of the
parents is not related to subjects’ expressiveness
(z = -.61, p < .54) but, within the
father, when his
strength and activity are considered in combination, a significant
relationship can be noted in that emotionally
expressive male subjects perceived
their fathers as strong but passive (z =
-2.03, p < .04).


Intellectual Orientation versus Projection 
Scores


The significant findings regarding the relationship
between the subjects’ intellectual
orientation scores and their attribution scores
are reported in Table 3. With regard to the
male subjects, only the activity of the mother
was found to be related to the subjects’ intellectual
orientation. It can be seen that the
mother’s activity took precedence over the
father’s activity and her own strength. The
more active the mother was perceived to be, the
greater was the male subjects’ intellectual
orientation (z = -2.08, p < .04).

Female subjects’ intellectual orientation
was related to the father’s rather than the
mother’s activity, and the father’s activity
took precedence over his strength. The more
passive father was associated with female
subjects’ greater intellectual orientation
(z = 1.52, p < .13).

The findings can be summarized for easier
interpretation.


For male subjects:


1. Male subjects who perceived both parents
as being strong had fewer dependency
needs (z = 2.09, p < .04). Perceived strength
in both parents rather than in one or the other
made the greatest differentiation among the
dependency needs of the subjects.

2. With regard to the perceived activity of
the parents, the activity of the father took
precedence over that of the mother. The subjects
were more likely to be more dependent
if their fathers were seen as active (z =
-1.69, p < .09).

3. When the interaction of these variables
was studied within each parent, it was found
that it was an interaction between strength
and activity in the father that made the
unique contribution to a subject’s dependency
(z = 2.45, p < .01). The subjects who perceived
their fathers as strong and passive had
fewer dependency needs.

4. Similarly, when interaction of needs was
studied in mothers, the strength of the mother
rather than her activity contributed to the
subjects’ dependency, with subjects who perceived
their mothers as being weak having
the higher need for dependency (z = -1.54,
p < .12).

5. Male subjects who perceived their mothers
as active received higher emotional expressiveness
scores (z = -2.16, p < .03).

TABLE 3


COMPARISON OF RANKING OF SUBJECTS’ INTELLECTUAL ORIENTATION SCORES WITH THEIR SCORES FOR
PROJECTION OF “STRENGTH” AND “ACTIVITY” ONTO PARENTS


Rankings of intellectual orientation scores

                                         |               Males                |              Females
Projection scores                        |  N | Obtained | Expected |    z    |   N | Obtained | Expected |   z
Between parents by subject
  Activity-passivity
    Father activity and mother activity   | 22 |  1149.0  |   982.8  |         |  27 |  1415.0  |  1485.0  |
    Father activity and mother passivity  | 23 |   926.0  |  1064.7  |         |  26 |  1248.5  |  1430.0  |
    Father passivity and mother activity  | 23 |  1157.0ᵃ |  1064.7  |         |  29 |  1720.0ᵃ |  1595.0  |
    Father passivity and mother passivity | 22 |   863.0  |   982.8  |         |  27 |  1611.5ᵃ |  1485.0  |
    Total                                 | 90 |  4095.0  |  4095.0  | -2.08** | 109 |  5995.0  |  5995.0  | 1.52*

ᵃ Indicates categories that were combined within a set for Mann-Whitney U test.
* p < .13.
** p < .04.

The mother’s activity took precedence over
the father’s, and her activity contributed
more to the subjects’ expressiveness than did
her potency.

6. The interaction of strength and passivity
in the fathers of male subjects was related
to more expressiveness (z = -2.03, p < .04).

7. The male subjects had higher intellectual
orientation if mothers were active. The
mother’s activity took precedence over the
father’s activity and over her own strength
(z = -2.08, p < .04). This is similar to the
finding in 5 above.


For female subjects: 


1. Perceived weakness in both parents 
rather than in one or the other was associated 
with female subjects who had greater need for 
dependency (z = -1.71, p < .09).

2. The activity of the father was more important
to the female subjects than that of the
mother. Subjects who had more need to depend
saw their fathers as passive (z = 1.91,
p < .06).

3. With regard to the interaction of the 
traits within a parent, the activity of the 
father took precedence over his contribution 
of strength. Since there was no interaction of 
the traits within the father figure, this find- 
ing replicated 2 above. 

4. It was the mother’s strength rather than 
her activity that the female drew upon with 
regard to her need to depend. The weaker the 
mother, the greater the female dependency 
(z = 1.54, p < .12).

5. There was no relationship between emo- 
tional expressiveness and the attribution of 
traits. 

6. Perceived passivity in the father was related
to greater intellectual orientation in the
female subjects. The converse occurred here
as compared to the male subjects. The father’s
activity took precedence over the mother’s
and over the father’s own contribution of
strength (z = 1.52, p < .13).


DISCUSSION


The findings regarding the relationship
between the subjects’ dependency needs and
their attribution of strength and activity
to their parents have added supporting
data to identification theory. For male and
female subjects, the need to depend was
found to be significantly related to strength
that was perceived to reside in both parents.
Apparently perceiving one parent as strong,
whether it is the like-sex or cross-sex parent,
is not sufficient to reduce the subjects’ feelings
of dependency.

This finding not only supports the speculations
about the importance of the familial interaction
in the development of the personality
structure, but it also lends credence to
the contention that a supporting relationship
within the family constellation leads to adequate
identification and subsequent better
adjustment (Carlson, 1963; Slater, 1961).

The data are also in accord with the position
(Gray, 1959; Lazowick, 1955) that the
paternal intervention is of more importance
to male adjustment than is the maternal
identification for the female. Regarding this
point, it can be recalled that both the father’s
strength and activity took precedence over
the mother’s in relation to how dependent
the male was. On the other hand, it was the
mother’s strength but the father’s activity
that differentiated the female needs.

There is an interesting contrast between
the influence of the father’s and mother’s
activity on the subject’s dependency needs.
If directional differences are disregarded for
a moment, it can be stated that the activity
of the father contributed significantly to the
strength of the dependency needs of the subject,
regardless of the sex of the respondent.
On the other hand, the activity of the mother
was not a critical factor in differentiating
either the male or female subject’s need to
depend. Although the activity of the mother
did influence the shaping of some of the other
personality characteristics discussed below, it
was the mother’s strength and not her activity
that the subjects apparently found essential
for reduction of their dependency needs.

It is also noteworthy that within the father 
it was the interaction of strength and activity 
that contributed most to differentiating the 
male subject's dependency needs. When the 
father was seen as strong but passive, the 
male subject was most likely to have fewer 
dependency needs. In order to interpret the
meaning of this finding, it is important to
review the scales that contributed to the concept
of a passive father. The passive father
was one who was described on the semantic
differential as calm, still, and light rather
than excitable, moving, and heavy. Perhaps a
more adequate abstraction from these scales
would be levelheadedness. Such an interpretation
of the data would be sensible since it
implies that the strong but calm father is
enough in control of his own affairs that he
can afford to be receptive to the son’s bids for
dependency. One would expect that such receptivity
would lead developmentally to need
reduction as was found here.

This interpretation of the data, however,
confounds the findings that the highly dependent
females projected passivity onto
fathers. Perhaps for the female subjects, the
father’s calmness was construed somehow to
mean coolness and to create distance between
daughter and father rather than receptivity.
Additional work needs to be undertaken regarding
semantic differences for male and
female subjects.

The major findings relating to the subjects’
needs for emotional expressiveness and
intellectual orientation have to do with the
male subjects’ responses. It was found that
perceived activity on the part of the mother
was the most important contributor to differentiating
the male subjects’ emotional expressiveness
and also their needs for intellectual
orientation. The more active the
mother, the higher the male subjects’ needs.

Although descriptive data like those reported
here do not permit the reconstruction
of the sequential development of the two
needs involved, a rather compelling hypothetical
interpretation of the finding would
be to look on the intellectual orientation as a
defense against the emotional expressiveness.
It will be recalled that the scales defining
passivity were interpreted more sensibly to
mean levelheadedness. Within this interpretation,
the polar attribute, in this case the active
mother, would be described as impulsive,
rash, or unpredictable. It would seem to follow
developmentally that the son may need to
defend against this maternal impulsivity by
some such means as intellectualization.

A developmental interpretation of need
structures is also suggested when the projec- 
tions of the female subject with strong de- 
pendency needs are compared with those of 
the subject with high intellectual orientation. 
It will be recalled that the female subjects 
who had the stronger dependency needs saw 
their fathers as passive, and it was suggested 
that this passivity may have warded off a 
warm reciprocal relationship conducive to 
need reduction. When this finding is coupled 
to the fact that the females who had the 
greatest intellectual orientation were also the 
ones who projected passivity onto their fathers,
the interpretation can be extended. On
the basis of these data it could be hypothesized
that intellectual orientation grows out
of earlier defenses against the unresolved de- 
pendency needs. That is, if the father, as the 
most significant male figure in the develop- 
ment of the female child, is unable to provide 
the child with the kind of relationship in which 
she can learn to depend on a male, she would 
have learned several lessons. Among other 
things, she would have learned that it was 
wrong to depend and that she must rely on 
her own resources. As a consequence, the child 
may use intellectual defenses to assist her in 
overcontrolling as a way of warding off the 
feelings of deprivation. 
ASSIGNMENT OF RESPONSIBILITY FOR
AN ACCIDENT ¹


ELAINE WALSTER 


University of Minnesota 

This experiment tested the proposition that the worse the consequences of an
accidental occurrence, the greater the tendency of Ss to assign responsibility for
the catastrophe to some appropriate person. The experiment also tested the
specific proposition that an accident victim would be assigned increasing responsibility
for his accident as its severity increased. Data supported these
2 hypotheses. There seemed to be 2 ways of judging the same behavior as more
responsible for the accident when accidental consequences were severe: (a)
Ss could perceive the responsible person as more careless when accidental consequences
were severe; (b) Ss could perceive the responsible person’s behavior
correctly, but apply stricter moral standards in judging the behavior when
accidental consequences were severe. Data indicated that only the 2nd method
of assigning responsibility was utilized by Ss.


People have no real control over many of
the things that happen to them. Cars can fail
to start in the morning; students can get
measles just before a final exam; air and rail
accidents can kill the traveler; and floods,
plagues, and hurricanes can destroy entire
communities. Control over all environmental
events is impossible both because techniques
for preventing some accidents are unknown
and because precautionary steps may be impractical
considering the rarity of the occurrence
and the number of variables involved.
We acknowledge, then, that some kinds of
accidents are bound to occur, and that these
accidents could happen to anyone. And when
we hear of an accident, for the most part we
sympathize with the helpless victim of fate.

Often, however, if we feel the accident is a 
serious one and we reflect on it at some
length, we begin to have vague feelings that 
perhaps this accident was not beyond the 
victim's control. For example, the thought 
may cross our mind that the flood victim
should have had enough foresight to build 
his home further from the river; that the vic- 
tims of political persecution should have an- 
ticipated the inevitable and emigrated before 
real persecution began. Disasters are also 
sometimes judged by observers (and by the 
victims themselves) to be punishment for the 


¹ This investigation was supported in part by
Grant MH 10192-01 from the National Institute of
Mental Health, United States Public Health Service.
victims’ sinful lives (Freud, 1949; Rosenman,
1956; Takashi, 1951) or the results of
their practical ineptness (Wolfenstein, 1957). 
When the victim is not identified or seems 
blameless, we still probably have a tendency 
to wonder if someone could not have pre- 
vented the catastrophe. Although many dis- 
aster reports explicitly deny that victims and 
observers tend to blame others for disasters 
(Bucher, 1957), the reports themselves re- 
veal a tremendous number of attempts to 
assign responsibility to someone. Victims and 
observers ask: “Did the airline officials really 
inspect the airplanes which crashed?” 
(Bucher, 1957); “Were the buildings razed 
in the tornado of inferior materials?” (Perry 
& Perry, 1959); “Would the officials have 
decreased disaster damage if their warning 
and evacuation instructions had been more 
forceful?” (Clifford, 1956; Wallace, 1956). 
Do people commonly ask “Who is to 
blame?” when hearing of an accident? And if 
so, under what conditions is such a tendency 
to assign responsibility especially pronounced? 
We hypothesized that the tendency to try to 
assign responsibility to someone when we hear 
about an accident increases as the conse- 
quences of the accident become more serious. 
We reasoned: When we hear of a person 
who has suffered a small loss, it is easy to 
feel sympathy for the sufferer, attributing his 
misfortune to chance and acknowledging that 
unpleasant things like the accident can happen
to a person through no fault of his own.
As the magnitude of the misfortune increases, 
however, it becomes more and more unpleas- 
ant to acknowledge that “this is the kind of 
a thing that could happen to anyone." Such 
an admission implies a catastrophe of similar 
magnitude could happen to you. If we can 
categorize a serious accident as in some way 
the victim's fault, it is reassuring. We then 
simply need to assure ourselves that we are a 
different kind of person from the victim, or 
that we would behave differently under simi- 
lar circumstances, and we feel protected from 
catastrophe. 

But what if it is not possible to discredit 
the victim? Does one also gain reassurance 
from attributing responsibility for the catas- 
trophe to someone else? Probably so. If a 
serious accident is seen as the consequence of 
an unpredictable set of circumstances, beyond 
anyone's control or anticipation, a person is 
forced to concede the catastrophe could hap- 
pen to him. If, however, he decides that the 
event was a predictable, controllable one, if 
he decides that someone was responsible for 
the unpleasant event, he should feel somewhat 
more able to avert such a disaster. He can 
protect himself by putting people like the 
ones responsible away—isolating them so they 
cannot cause calamities, or reforming them so 
they will not cause them. Or, he can simply 
assure himself that the people in contact with 
himself are different or that their behavior 
could be controlled by him. 


METHOD 
Experimental Design 


The following experiment was designed to test the 
hypothesis that the worse the consequences of an 
accidental event, the greater the tendency for people 
to assign responsibility for the accident to someone 
possibly responsible for the accident.2 Included in


2 Why have we limited ourselves to the situation 
where someone is possibly responsible for a catas- 
trophe when our rationale indicated simply that the 
desire to assign responsibility to someone increased 
directly with the severity of the experimental conse- 
quences? It must be recognized that we do not 
expect the subjects to surrender all standards of 
judgment and habits of reasonableness in order to 
satisfy their desires to assign responsibility. The
tendency we propose will naturally have the strongest 
effect when the objective evidence as to the responsi- 
bility of our stimulus person is ambiguous. If the 
objective evidence becomes overwhelming either that 




the design is a test of the proposition that the 
more serious the negative consequences the victim 
suffers, the more we feel that he is in some way
responsible for the punishment he received. 

To test these propositions subjects were asked to
rate the responsibility of a young man for an accident.
He was described as having taken reasonable safety
precautions to avoid the accident. The severity of the
consequences of the accident was beyond the man’s
control. One-half of the time the consequences were
inconsequential (as trivial as a dented fender).
One-half of the time the consequences were serious (as
serious as demolishing a car or injuring a person).

[Only the potentially] responsible person suffers:
inconsequential damage, I; considerable damage, II.
Persons in addition to the potentially responsible
person suffer: inconsequential damage, III;
considerable damage, IV.

We predicted that more responsibility would be
assigned to the careful young man when the consequences
of the accident were considerable (Conditions II and
IV) than when the consequences were inconsequential
(Conditions I and III). Condition I is the control
group for Condition II; Condition III is the control
group for Condition IV. No comparisons between
Conditions I-II and III-IV are planned. Conditions I-II
and III-IV are, then, in effect, two parallel
experiments. Conditions III-IV test generally whether
or not more responsibility is assigned to a possibly
responsible person when an accidental occurrence is
serious than when it is not. Conditions I-II test
specifically whether or not increased responsibility
will be assigned to the victim for a severe accident,
even when the victim is the only one who suffers any
negative consequences from the accident. We anticipated
subjects might feel sympathy for the young man in
Condition II. In spite of this we still expected that
subjects would assign him greater responsibility for
his suffering when it was considerable than when it
was inconsequential.

In this experiment we also wanted to investigate 
some of the techniques used in assigning responsibil- 
ity to someone for an accident. There seemed to be 
two possible ways of finding a person at fault: The 
person blamed for the mishap could be perceived 
as more careless under some conditions than others. 
Subjects could perceive the blamed person’s behavior 
identically in all conditions, but the severity of the 
standards by which subjects judged him could alter. 
Behavior which would normally be deemed acceptable
could under some conditions be judged as
inadequate or immoral.


Procedure 


Subjects were 44 women and 44 men from an
introductory psychology course at the University
of Minnesota.


the stimulus person is objectively responsible (or
not responsible) for an accident, the subject's desire
to assign responsibility can have little effect on
his perception of that person’s responsibility.
The experimental situation was arranged so that
when the subjects gave their opinions about the
stimulus person they would be confident that the
experimenter was accepting this information as
indicative of the stimulus person's characteristics. The
objective was to avoid making the subject fearful
that his opinions were in some way telling the
experimenter about his personality. In order to avoid
making the subjects feel like subjects, the
experimenter treated them as if they were part of the
research staff. Two subjects were scheduled for each
hour. The experimental room to which the subjects
reported had books and tapes piled in disarray on
the tables. After both subjects arrived, they were
told that instead of serving as subjects in a usual
experiment, they would have a chance to actually
help select the materials and procedures to be used
in testing a hypothesis.


A hypothetical research project was then described
to the subjects as an investigation of the effect of
extreme fear on a soldier’s ability to make accurate
evaluations of others. Previous “inconclusive”
research in this area was described to the subjects
and then a “research design” was explained. It was
stated that soldiers would be listening to tape
recordings under either fear or control conditions, that is,
either a few hours before the soldiers knew they
were to engage in a mock battle (fear condition)
or a few hours before they were to be given a
pass into town (control condition). These recordings
would contain descriptions of a real person.
Soldiers would then be asked to evaluate the person
described and give their hunches as to how the
person described would react in some situation not
described on the tape. By comparing hunches with
what in fact happened, soldiers’ accuracy under
fear-arousing conditions could be estimated.

At this point the subjects’ part in the selection
of research materials was explained. Subjects were
told first to help in the selection of appropriate
tapes. The experimenter showed the subjects how
the tape recorder worked and then told the subjects
she wanted each of them to listen to one of the
tapes being considered and to ascertain if they had
any trouble understanding what the people were
talking about. When they had heard the whole thing
the experimenter said she would then like them to
fill out the same kind of questionnaire the soldiers
would be given, expressing their own reactions to,
and hunches about, the person described. The
experimenter stated that the subjects’ reactions to the
tape would allow her to eliminate any tapes that
were misleading, too hard, or too easy. Subjects
were also asked to comment on the intelligibility of
the questionnaire.

The preceding rationale was designed to encourage 
the subjects to be frank and open in expressing their 
feelings about the young man described on the tape. 

Each subject still had to be assigned to an
experimental condition. Random assignment was made in
the following way. After demonstrating the tape
recorder to the subjects (S1 and S2), the experimenter
handed S1 one of the four tapes saying “Why don't
you get started on this tape?” The experimenter
also gave him a questionnaire. The experimenter
indicated that S2 could listen to one of the remaining
tapes on the recorder across the hall. The
experimenter took S2 to this room, gave him a tape and
a questionnaire, and indicated that he should return
to the experimenter's office when he was finished so
they could all talk about their reactions to the tapes.
Tapes were assigned in a random manner to S1
and S2.
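
The assignment step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the report says that tapes were assigned "in a random manner" and that S2 received "one of the remaining tapes," so the sketch assumes the two subjects in a session always heard different tapes. The function name `assign_pair` and the use of a seeded generator are hypothetical and do not come from the report.

```python
import random

# The four experimental tapes (Conditions I-IV).
TAPES = ["I", "II", "III", "IV"]

def assign_pair(rng):
    """Draw two different tapes at random for the two subjects in one session.

    Assumption: S2 receives one of the tapes remaining after S1's draw,
    so the two subjects never hear the same tape in a session.
    """
    tape_s1, tape_s2 = rng.sample(TAPES, 2)
    return {"S1": tape_s1, "S2": tape_s2}

# Example: one session, with a seeded generator so the draw is repeatable.
session = assign_pair(random.Random(0))
```

A procedure of this kind does not by itself guarantee equal cell sizes; how the balance across the four conditions was achieved is not stated in the report.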

Tapes I-IV were for the most part identical. The 
boy purportedly described on all tapes was “Lennie 
B." First his mother described Lennie. She indicated 
that he was a good boy; that he had a few neigh- 
borhood problems when he was very young, but that 
things were fine at the present time. She then 
discussed her feelings about child rearing. A school 
craft instructor then spoke. He said Lennie was a 
nice, enthusiastic person. The only negative thing 
mentioned about the boy was that he had not finished
one of his craft projects. The incomplete work
was attributed to both a lack of skill and a lack of
money; Lennie was described as conscientious on 
the work he did complete. 

Then came the experimental communication. The 
accident was described. The description of Lennie’s 
behavior prior to the accident was identical on all 
tapes. A neighbor (who spoke in a casual, unemo- 
tional voice at all times) stated: 


that was late this summer. Lennie had just bought 
a car—it was about 6 years old or so. He and 
his buddy drove up to Duluth and parked at the 
top of this hill. Lennie’s buddy said Lennie did 
set the handbrake, but while they were gone the 
car started rolling. Some camp police who checked 
the car later said the brake cable was pretty badly 
rusted and must have broken. Anyway, the car
started rolling .... 


On Tapes I and II identical possibilities were 
presented; Lennie might have damaged his own car 
either a great deal or not at all, depending upon 
whether or not his car was stopped against a tree
stump a short way ahead. 


On Tape I the car was stopped and Lennie’s 
possession suffered no real damage. 

[Tape I]: If the car had run all the way down 
the hill it would have crashed into a big tree 
that’s at the bottom. But the car didn’t go very 
far at all.... It rolled against an old stump 
that was sticking out a little way into the street 
and stopped. The car just got a tiny dent in the 
front bumper and that’s all. Lennie didn’t have 
any insurance at the time. 


On Tape II the car was not stopped and thus the 
damage was serious. 

[Tape II]: The car might have rolled to a stop 
against an old stump that was sticking out a 
little way into the street just in front of where 
the car was parked. Instead, the car just missed
it and went rolling all the way down the hill.
The car really hit this big tree that’s at the bottom
and then kind of bounced off it onto some others.
The car was completely totaled; the impact bent 
the frame, rocked the engine off its mounts, bent 
the drive shaft—just completely ruined the front 
end. Lennie didn't have any insurance at the time. 


It should be noted that on Tapes I and II Lennie's
behavior is identical and so are the possible conse- 
quences; only the actual consequences differ. 


On Tapes III and IV we introduced the possibility
that someone in addition to Lennie could be hurt
by the accident.


On Tape III, as on Tape I, the actual damage
was inconsequential. 

[Tape III]: If the car had run all the way down 
the hill it would have crashed into this store 
that’s right at the bottom, and probably hurt 
either a kid or the grocer that were in the store. 
But the car didn't go very far at all. It rolled 
against an old stump that was sticking out a little 
way into the street and stopped. The car just got 
a tiny dent in the front bumper and that's all. 
Lennie didn't have any insurance at the time. 


On Tape IV, as on Tape II, the damage was 
tremendous: 

[Tape IV]: The car might have rolled to a stop 
against an old stump that was sticking out a little 
way into the street just in front of where the car 
was parked. Instead, the car just missed it and 
went rolling all the way down the hill. The car 
really crashed through the window of this store 
that's right at the bottom. It hit a kid that was 
standing at the counter and the grocer. The kid 
was just dazed a little, but the grocer was hurt 
pretty badly. He was in the hospital all last year. 
Lennie didn't have any insurance at the time. 


From this point on, the tapes were again identical. 
A high-school history teacher indicated that Lennie 
was an average student who tried to contribute to 
the class. Finally a neighbor spoke and explained he 
really did not know the boy. 

After hearing one of the tapes described above, the
subjects expressed their opinions on the following 
questions: 


1. Do you feel that any responsibility should be 
assigned to Lennie for the automobile accident? 


Depending on which tape the subject received, one 
of the following phrases was inserted in parenthesis 
after the word "accident": Tapes I and III (in 
which his fender was dented); Tape II (in which 
his car was demolished); Tape IV (in which the 
child and the grocer were hurt). The subject could 
check one of four alternatives ranging from 
“Lennie was not at all responsible; the accident was 
completely beyond his control” to “Lennie was 
completely responsible for the accident.” 

How much carelessness the subjects attributed to 
Lennie was assessed in the next section of the 
questionnaire: 
We'd like your “hunches” about several [details]
not supplied on the tape.

1. Lennie owned a car for five months before
the accident occurred. Make as good a guess as
you can whether or not he ever had his brakes
checked during that period. (A 15-point scale was
provided which ranged from 1, “I’m extremely sure
he had a brake check,” to 15, “I’m extremely sure
he did not have a brake check.”)

2. Lennie’s friend told their neighbor that Lennie
pulled the handbrake before parking on the
hill. How convinced are you that Lennie in fact
did so? (On the scale provided for this question,
1 indicated “Extremely sure he pulled the
handbrake,” and 15, “Extremely sure he did not pull
the handbrake.”)

3. Do you think that Lennie turned his wheels
toward the curb before parking on the hill? (The
subject could check one of four alternatives in
answer to this question ranging from 1, “He
probably turned his wheels toward the curb as far as
possible,” to 4, “He probably did not turn his
wheels toward the curb at all.”)


The moral standards the subject professed, and
by which he presumably judged Lennie, were
assessed in the third section of the questionnaire.
This section was headed:


We are interested in your personal convictions
in these questions.
4. How often is a person “morally responsible”
for having his brakes and other safety devices
checked? (The subject then had his choice of
either filling in a blank stating “Every ___
months (years)” or checking a box stating “A
person is not morally responsible for having his
brakes and safety devices checked.”)
Although a record was kept of the number
of months the subject indicated, answers were
recorded as follows: 1 was assigned to replies of
“A person is not morally responsible.” This is
the most lenient standard possible. The number [...]
was assigned for replies of “every 2-[...] years.” The
most severe standard possible (“every day to every [...]
months”) was assigned a score of 6.
5. Do you feel it is “morally wrong” not to
have automobile insurance? (The subjects could
check one of four alternatives, ranging from 1,
“A person is not morally responsible for having
insurance,” to 4, “It is extremely wrong not to
have insurance.”)


The last question on the questionnaire asked
subjects how much they liked Lennie.


RESULTS 


Our primary interest is in whether or not
subjects judged identical behavior on a per-


3 In all conditions Lennie was rated almost
identically on this filler question. (Condition means ranged
from 7.3 to 7.5 on a 15-point scale ranging from
“Disliked him extremely much” to “Liked him
extremely much.”)
TABLE 1


MEAN RESPONSIBILITY ASSIGNED FOR THE ACCIDENT AND THE TECHNIQUES FOR ASSIGNING
RESPONSIBILITY BY CONDITION


                                         Only driver punished      Others also punished
                                         Mild         Severe       Mild         Severe
                                         consequences consequences consequences consequences

Amount of responsibility assigned (a)     2.5          3.0          2.6          3.2
Subjects’ “personal convictions” (b)
  Frequency of brake checks               3.2          3.7          3.8          4.2
  Responsibility for insurance            2.4          2.5          2.5          3.1
  Strictness index (c)                   (5.5)        (6.2)        (6.5)        (7.3)
Carelessness imputed to driver (d)
  Failure to check brakes                10.5         11.1         11.9         10.9
  Failure to pull handbrake               4.1          4.7          5.1          4.8
  Failure to turn wheels                  3.5          3.7          3.5          3.6
  Carelessness index                    (25.0)       (26.9)       (27.4)       (26.6)

(a) The higher the mean, the more responsibility assigned to Lennie.
(b) The higher the mean, the stricter the moral code professed by subjects.
(c) A strictness index could be computed for only 72 of the 88 subjects since
only 72 students answered both the brake and insurance questions. (The brake
check question was rewritten after the first 16 subjects experienced
difficulty in answering it.)
(d) The higher the mean, the more careless Lennie is rated.


son’s part as more responsible for an accident 
when the accident was severe than when the 
accident was inconsequential. The results pre- 
sented in Table 1 indicate clearly that judg- 
ments are dependent upon the severity of 
consequences.

When told that Lennie might have demolished
his car or might have hit the grocery
store and the store patrons if he had not been
barely stopped by the tree stump, subjects
rated his responsibility for the accident at
2.5 and 2.6, respectively. They felt Lennie
was somewhere between “only slightly
responsible” and “somewhat responsible” for
the accident, on the average. When he was
not lucky enough to stop against the stump,
the responsibility assigned was 3.0 if his car
was demolished and 3.2 if the grocer was hit
and the grocer’s store damaged. On the
average, Lennie is judged slightly more than
“somewhat responsible” but less than “very
much responsible” for the accident when the
consequences were severe.

Significantly more responsibility is assigned 
to Lennie for the severe accidents than for 
mild ones (F = 8.73, df = 1/84, p < .01).
This is true even when the accidents in which 
Lennie alone suffers and the accidents in 
which others suffer also are analyzed sepa- 
rately. More responsibility is assigned to 
Lennie by subjects hearing Tape II than by 
those hearing Tape I (t = 2.2, p < .05, two-tailed),
and more by subjects hearing Tape IV than by those
hearing Tape III (t = 2.0, p < .06, two-tailed).

When the above data are analyzed in
greater detail, a peculiar finding appears. The
data from men and women subjects were
tabulated separately. Early in the experiment
it became apparent that although the men
judged Lennie as considerably more responsible
when he hit the grocer (M = 3.4) than
when he might have done so (M = 2.3),
women rated him equally responsible in
Conditions III and IV (M = 3.0). Sex of subject
seemed to interact with the severity of the
accident in Conditions III and IV in determining
the amount of responsibility subjects
assigned to Lennie (F = 4.25, df = 1/40,
p < .05).

In addition, this means that women were 
judging Lennie significantly more responsible 
for the accident in which he dented his fender 
if the neighbor mentioned the possibility that 
he could have hit the grocer and a child 
(M = 3.0) than if the neighbor only mentioned
the possibility that Lennie might have destroyed
his own car (M = 2.4, t = 4.39, df = 21,
p < .001). Men did not assign more
responsibility to Lennie as the possible conse- 
quences of the accident were increased. Why 
women assign so much responsibility to Len- 
nie when he could have hit a person and no 




more responsibility to him when he in fact 
does hit someone is not clear. 

We can now begin to investigate how sub- 
jects go about assigning additional respon- 
sibility to “someone” when an accident is 
severe.

1. Do subjects perceive the responsible 
person as more careless if the accident is 
severe than they would if consequences were 
inconsequential? 

A total carelessness measure was computed 
by adding together the three carelessness 
scores after variances of the three scores had 
been equalized. From an examination of 
Table 1 it does not seem that there is a
general tendency to impute greater carelessness
to Lennie in Conditions II and IV than
in I and III (F = 1.06, df = 1/84); this
is not significant.
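
One way to build a composite of this kind is sketched below. The report states only that the three carelessness scores were added "after variances of the three scores had been equalized"; it does not say how. The sketch assumes each score was divided by its own standard deviation before summing, which is one common way of giving the 15-point and 4-point items equal weight. The function names and the sample ratings are hypothetical, and the resulting values are not on the scale of the index means reported in Table 1.

```python
from statistics import pstdev

def equalize_variances(rows):
    """Rescale each column of ratings to unit standard deviation.

    rows: one (brake, handbrake, wheels) tuple per subject.
    Assumption: "equalizing variances" means dividing each score by the
    standard deviation of that score across subjects. A column with no
    variation would raise ZeroDivisionError.
    """
    columns = list(zip(*rows))
    sds = [pstdev(column) for column in columns]
    return [tuple(value / sd for value, sd in zip(row, sds)) for row in rows]

def carelessness_index(rows):
    """Sum the three rescaled carelessness scores for each subject."""
    return [sum(row) for row in equalize_variances(rows)]

# Hypothetical ratings for three subjects: two 15-point items, one 4-point item.
ratings = [(10, 4, 3), (12, 6, 4), (14, 2, 2)]
index = carelessness_index(ratings)
```

Without some rescaling of this sort, the two 15-point items would dominate a simple sum and the 4-point wheel-turning item would contribute very little.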

When accidental consequences are severe 
(Conditions II and IV), Lennie is judged 
more responsible for the accident. And yet, 
it does not seem that this increased respon- 
sibility is a result of the imputation of 
greater carelessness to him. 

2. Do subjects judge the same behavior 
more strictly when the behavior results in 
serious consequences? 

Question 4⁴ asked: “How often is a per-
son ‘morally responsible’ for having his 
brakes and other safety devices checked?" 
Subjects judging Lennie in the severe acci- 
dent conditions indicated that a person was 
morally obligated to have a safety check 
much more often than did other subjects. 
Subjects judging Lennie in Condition I (when 
he might have demolished his car) indicated 
that a person was only obligated to have 
a safety check every 11.3 months (M on 
derived index = 3.2). When Lennie's car was
demolished, however, subjects judged his be- 
havior by more severe personal standards— 
they felt a person was morally obligated to 
have a brake check every 9.0 months (M 
= 3.7). Similarly, the subjects rating Len- 
nie's behavior from Tape III required a 
safety check every 7.8 months (M = 3.8).
In Condition IV, when Lennie's car actually 


4 Data from only 72 subjects are available. The
"moral responsibility for having brakes checked" 
question was rewritten after the first 16 subjects 
reported difficulty in answering the original question. 


hit others, subjects' standards became the
most severe; the subjects felt one was obligated
to have his car checked every 7.1 months
(M = 4.2). Thus, the “moral convictions”
subjects profess on Question 4 are stricter
under severe than under mild accident
conditions (F = 4.07, df = 1/68, p < .05). A
slightly stronger moral censure was applied 
to people who did not own automobile insur- 
ance by those subjects judging Lennie under 
severe accident conditions than by subjects 
in Conditions I and III, but these differences 
were nonsignificant. 

An index of “strictness of the subjects’ 
standards" was computed by adding together 
subjects’ scores on Questions 4 and 5. As was 
predicted, the standards subjects profess (and
the standards by which they presumably 
judge Lennie) increase in strictness as the 
“accidental” consequences become more 
severe (F = 5.62, df = 1/68, p < .05). 
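
The construction of the strictness index can be illustrated with a short sketch. It follows the description above (the index is the sum of a subject's scores on Questions 4 and 5) and the note that an index exists only for subjects who answered both questions. The function name and the sample responses are hypothetical; representing an unanswered question as `None` is an assumption made for the illustration.

```python
def strictness_index(q4_score, q5_score):
    """Return Question 4 score (1-6) plus Question 5 score (1-4).

    Returns None when either answer is missing, since the index can be
    computed only for subjects who answered both questions.
    """
    if q4_score is None or q5_score is None:
        return None
    return q4_score + q5_score

# Hypothetical responses: (Question 4 score, Question 5 score) per subject.
responses = [(3, 2), (None, 4), (6, 4), (1, 1)]
indices = [strictness_index(q4, q5) for q4, q5 in responses]

# Only subjects with a complete index enter the analysis.
usable = [value for value in indices if value is not None]
```

Dropping incomplete cases in this way is what reduces the analysis from 88 to 72 subjects, which is consistent with the error degrees of freedom of 68 reported for the strictness comparisons.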


DISCUSSION AND SUMMARY


How much support, then, is there for the
hypothesis that the tendency to assign
responsibility to someone possibly responsible
for an accident increases as the consequences
of the accident become more serious? This
proposition was clearly supported by our
data. Lennie was judged as more responsible
for the accident when consequences were
severe than when consequences were trivial.

It seems, too, that the convictions subjects
profess and the standards they presumably
use in judging Lennie are significantly more
severe when accidental consequences are
serious. In all conditions, Lennie was described
as having taken identical safety precautions.
There is no indication that subjects perceive
Lennie as having taken fewer safety
precautions in one condition than in another. But
the convictions subjects expressed concerning
how careful one should be were harsher when
accidental consequences were serious than
when consequences were mild.

In the introduction it was suggested that
it might even be reassuring to assign
responsibility for a severe accident to someone who
was not a victim of the catastrophe. To test
this suggestion that a nonvictim will be
increasingly blamed for increasingly severe
accidents, a third set of conditions would have
to be run. Conditions III-IV are not a clear
test of this suggestion. If it could be
demonstrated that the only victim subjects
perceived in Condition IV was the shopkeeper,
then the increased responsibility assigned to
Lennie in Condition IV would be a
demonstration of this phenomenon. Unfortunately,
however, Lennie was also a minor victim in
Condition IV. (His car must have been
damaged in the accident, too.) Thus, there is no
way to clearly determine whether increased
responsibility will be assigned to a nonvictim
for a serious accident from the data available.
A number of specific hypotheses about correlates of hypnotizability were tested.
A sample of 25 Ss representative of the investigators’ special volunteer
population was drawn. The criterion of hypnotizability used was the maximum
hypnotic depth achieved in as many intensive hypnotic training sessions as
E needed in order to feel confident that a stable plateau in the S's performance
had been reached. Findings confirmed the hypotheses that hypnotizability
could be predicted from a general propensity for unusual subjective
hypnotic-like experiences, from attitudes and motivational factors specifically relating
to hypnosis, and from postural sway, heat illusion, and vividness of mental
imagery. In addition, with few exceptions the hypothesis was supported that
there would be only negligible relationships between hypnotizability and
measures of personality. Defining hypnotizability as a plateau performance
rather than as some briefer estimate was shown to be cogent.


Over the years, in the process of training 
large numbers of volunteer subjects for par- 
ticipation in hypnosis research, the writers 
have developed clinical impressions or in- 
formal hypotheses about psychological in- 
dices which might predict hypnotizability. 
The present investigation was designed to 
test these impressionistic hypotheses. While 
in intent a hypothesis-testing experiment, the 
study has been developed in psychometric 
form. This procedure was used because the 
hypotheses under test refer to correlates of 
hypnotizability and because coefficients of 
correlation are often descriptively useful as 
rough guides for further evaluations. 

In order to test the hypotheses appropri- 
ately it was felt essential to reproduce as 
accurately as possible the original conditions 


1A condensation of this report was presented at 
the Annual Convention of the American Psycho- 
logical Association, August 30, 1963, Philadelphia, 
Pennsylvania. The study was carried out while the 
writers were affiliated with Harvard Medical School, 
Massachusetts Mental Health Center. The work was 
supported in part by Contract AF 49 (638)-728 and 
Grant AF-AFOSR-88-66 from the Air Force Office 
of Scientific Research. 

2 We wish to thank our co-workers, F. J. Evans,
L. A. Gustafson, and Emily C. Orne for their helpful 
comments. Appreciation in this regard is also due 
to E. A. Cogen and U. Neisser. Statistical work was 
done in part at the Computation Center, Massachu- 
setts Institute of Technology. 
under which the impressions had been
evolved. In methodological terms this requirement
meant that two key features had to be
included in the experimental design: the
sample drawn had to be representative of the
investigators' special population of volunteer
subjects and the criterion of hypnotizability
used in the study had to be equivalent to
what the investigators have meant operationally
by the term hypnotizability in their
everyday usage.


IMPRESSIONIST HYPOTHESES 


It has been the impression of the writers
that hypnotizability is correlated with only
two general types of psychological variables:
with attributes bearing a close relationship
to actual hypnotic performance, and with
attitudes which are highly specific to hypnosis
and to the hypnotic situation, such as
attitudes toward entering hypnosis under the
investigators’ given laboratory conditions.
Examples of attributes felt to be predictive
[...]
of mild hypnotic effects, specifically in this
regard the Heat Illusion and Postural Sway
tests. Another is the propensity for unusual
subjective experiences as measured by life
history reports of naturally occurring
hypnotic-like experiences.
Beyond these few indices it is the impression
of the writers that hypnotizability does
not correlate with any of the common dimensions
of personality measurement such as
hysteria, submissiveness, neuroticism,
extraversion, social adjustment, impunitiveness,
acquiescence tendency, intelligence, sex, and
so forth. It is hypothesized that correlations
sometimes reported between hypnotizability
and these various types of measures are a
function of inadequate criteria of
hypnotizability, selective personal appeals of
different hypnotists, and other situation-specific
factors.

For example, a professor of psychology
might consistently find a positive correlation
between intelligence and hypnotizability in
his research samples. This consistent correlation
may occur, however, only because this
particular investigator’s prestige and personal
manner selectively appeals more to his
brighter subjects and tends to evoke greater
resistance and hostility in his less bright
subjects.

Another investigator might discover a 
correlation between neuroticism and hypno- 
tizability but only because he expected to 
discover this correlation. Under the generic 
concept of demand characteristics Orne has
shown that the hypnotist’s expectations and
the subjects’ perceptions of these expectations
will subtly alter all hypnotic behavior (Orne,
1959, 1962a, 1962b). Rosenthal’s (1963)
studies of outcome expectations of the
experimenter have demonstrated the potency
of these variables in several areas. It is
plausible that the investigators’ initial
hypotheses are communicated to the subject
and in association with situational influences
elicit data confirming the investigators’
predictions. For example, since the present
investigators expected a correlation between
propensity for naturally occurring
hypnotic-like experiences and hypnotizability it is
plausible that subtle cues in their behavior
may unintentionally have contributed to the
resulting correlation.


HISTORICAL PERSPECTIVES


The problem of determining correlates of 
hypnotizability first received serious theo- 


retical attention during the celebrated Nancy- 
Salpétriére controversy of the 1880s. Charcot's 
(1882) neurological methodology led him mis- 
takenly to believe that only persons con- 
stitutionally predisposed to hysteria could be 
hypnotized. To Charcot hypnotizability was 
associated with a specific pathological proc- 
ess. A much less restrictive viewpoint was 
adopted by Bernheim (1884). Marshaling the 
broad clinical experience of practitioners in 
the Nancy tradition Bernheim replied that all 
individuals had the capacity to manifest some 
degree of suggestibility under appropriate 
circumstances. To Bernheim hypnotizability 
was a normal and universal potentiality. Dif- 
ferences in responsiveness—impressionability 
he called it—were due to subtle resistances
and varied habit patterns toward authority 
rather than to a lack of the underlying ca- 
pacity to respond. The hypotheses advanced 
in this report are closely congruent with 
Bernheim’s viewpoint. 

In the early 1930s correlates of hypnotizability
became a matter for empirical study
rather than polemics as academic psychologists
developed standardized scales for measuring
hypnotic performance. The psychological
testing movement was well developed and
quantitative methods for expressing relationships
were in general use.

Although seemingly promising at first, this
line of psychometric inquiry after 35 years
and over 50 studies is primarily a history of
disappointments. Published findings are either
initially negative or fail to be supported in
further work. Because the findings have
been conflicting, and because procedures of
sampling and determinations of hypnotizability
have been highly divergent and ambiguous,
there appears no satisfactory method
for drawing meaningful conclusions. It was
against this background of empirical confusion
that the investigators turned instead
to their own impressions and laboratory conditions.
For detailed reviews of the literature
on correlates of hypnotizability see Barber
(1964), Deckert and West (1963), and
Weitzenhoffer (1953). The separate studies
are discussed later in this report as relevant
to the classification of tests used in this
investigation.
SAMPLING PROCEDURE


The selection procedure was designed to 
produce a sample representative of the in- 
vestigators’ special population of volunteer 
subjects. This special population is composed 
mostly of college student subjects who al- 
ready have had considerable exposure to hyp- 
notic training. About half of the individuals 
in this population are, moreover, at the two 
extreme ends of the continuum of hypnotic 
responsiveness—that is, the distribution is 
rectangular rather than Gaussian. This abnormal
distribution is produced because the
majority of experiments on hypnosis in the
laboratory require both many highly responsive
and many unresponsive subjects.
In other words, in the process of continually
developing the volunteer subject pool the
investigators had strongly tended to expend
their primary effort in locating as many individuals
as possible at the two extremes of
hypnotic responsiveness.


Subjects 


The sample was composed of 25 students from 
universities in the Boston area. The subjects were 
individuals interested in hypnotic experimentation, 
willing to participate in a lengthy series of psycho- 
logical testing with only token monetary payment. 
The subjects were obtained on a random basis from 
the available pool. In the years prior to the experi- 
ment, many hundreds of subjects had passed through 
the laboratory with variable amounts of hypnotic 
training and experimental participation. Thus, most 
of the individuals selected for inclusion in this study 
had already received extensive hypnotic training 
prior to the experiment; as a result some subjects 
were able to enter deep hypnosis early while others 
were unresponsive to hypnosis in repeated hypnotic 
training sessions. Only 6 of the 25 subjects selected 
had had no prior exposure to hypnotic training. 
The inclusion of these few inexperienced subjects in 
the sample was intended to reflect the fact that at 
the time of the study about a fourth of the labora- 
tory’s training sessions were devoted to initial 
hypnotic screenings.


THE CRITERION OF HYPNOTIZABILITY 


The criterion of hypnotizability used in
this study was equivalent to what the investigators
have meant operationally by the term
hypnotizability in their everyday usage. In
most studies hypnotizability has been defined
in terms of a single score on a limited test
of hypnotic performance. The assumption
made is that relative ratings of the subjects’
performance would not be greatly altered by
additional hypnotic training. In the present
study, however, a subject’s hypnotizability
was defined as the maximum hypnotic depth
achieved in as many intensive hypnotic training
sessions as the experimenter needed in
order to feel confident that a stable plateau in
the subject’s hypnotic performance had been
reached. The controversy regarding “universal”
hypnotizability remains unresolved;
that is, whether or not with unlimited time
and ingenuity everyone eventually could be
profoundly hypnotized. Nevertheless all empirical
workers agree that if apparently cooperative
subjects are given skillful and intensive
training, most subjects most of
the time rapidly reach a plateau in hypnotic
performance after which no appreciable improvement
occurs regardless of the hypnotist,
the methods used, or the amount of further
training.

Defining a subject's hypnotizability as his 
stable plateau in hypnotic performance means 
that two diagnostic estimates are necessary: 
a performance rating is needed of the actual 
maximum hypnotic depth which the subject 
achieves in a given session, and a judgment 
is needed indicating that the subject’s hypnotic
performance would be very unlikely to
improve with additional training. 


Procedure 


In these hypnotic training sessions the examiner 
was allowed freedom to utilize any techniques which 
seemed appropriate and to explore clinically any 
issues which might then help maximize performance. 
All hypnotic sessions were administered by one of 
the investigators (MTO). To secure estimates of 
interjudge reliability another of the investigators 
(RES) observed all of the sessions through a
one-way mirror with audio arrangements.

Both the experimenter and the observer inde- 
pendently rated the maximum depth achieved. These 
ratings were clinical diagnoses by experienced hypno- 
tists based upon both objective hypnotic behavior 
and the subjects’ report. For each of the subjects the 
experimenter eventually made the judgment that 
further improvement in hypnotic performance was 
highly unlikely. Performances in these final training 
sessions were classified into four categories: less than 
light, light, medium, and deep. For all inferential 
purposes these four categories are consistent with 
the understandings of these terms in common usage 
and descriptively correspond to the major divisions
of the Davis-Husband scale (Davis & Husband, 
1931). The two sets of final performance ratings 
were almost identical (r=.96) with virtually no 
mean difference. They were averaged to form a single 
criterion measure of hypnotizability. 
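
The two computations described above, the agreement between the two sets of ratings and their averaging into one criterion score, can be sketched as follows. This is only an illustration: the numerical coding of the four categories (0 = less than light through 3 = deep) and the ratings themselves are hypothetical and are not the study's data, and the report does not say which correlation formula was used, so an ordinary Pearson coefficient is assumed.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def criterion_scores(rater_a, rater_b):
    """Average the two raters' depth ratings, subject by subject."""
    return [(a + b) / 2 for a, b in zip(rater_a, rater_b)]


# Hypothetical ratings for eight subjects (assumed coding: 0 = less than
# light, 1 = light, 2 = medium, 3 = deep). Not taken from the study.
experimenter = [0, 1, 3, 2, 3, 0, 1, 2]
observer = [0, 1, 3, 2, 3, 1, 1, 2]

interjudge_r = pearson_r(experimenter, observer)
criterion = criterion_scores(experimenter, observer)
```

With ratings this close, the coefficient is high and the averaged score differs from either rater's score only where the two disagreed.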

Tabulations of the subjects classified into each of
the four categories of hypnotizability are presented
in Table 1. An approximately equal percentage of
subjects fell into each of the four categories so that
the distribution is roughly rectangular. This distribution
is not characteristic of the general population
but rather represents the special sampling procedures
of our laboratory.


SPECIFIC PREDICTIONS 


The psychological tests included in the 
present investigation are classified below into 
five groups. The hypotheses tested are presented
as specific predictions. The tests are
described in more detail in the section on
Description of Tests.

I. Proneness for unusual subjective hypnotic-like
experiences. This group refers to the
subjects’ propensity to experience hypnotic-like
experiences where external reality is not
the major determinant of subjective reality.
This concept was measured by a set of Personal
Experiences Questionnaires. It was predicted
that these tests would correlate positively
with the criterion of hypnotizability.

II. Attitudes and motivational factors
specifically relating to hypnosis. This group
refers to factors arising from the subjects’
attitudes and motives, persistent but potentially
modifiable, which specifically relate to
being hypnotized—that is, conscious and nonconscious
attitudes toward hypnosis, preconceptions,
fears, motives, situational and interpersonal
considerations directly relevant to
entering hypnosis under the given conditions.
These concepts were measured by three tests:
Card 12M of the Thematic Apperception
Test (Murray, 1943), Traits Regarding Hypnosis
Inventory, and Background Index on
Hypnosis. It was predicted that these tests
would correlate positively with the criterion
of hypnotizability.

III. Personality attributes. This group refers
to common paper-and-pencil measures of
stable and enduring personality attributes—
for example, measures of hysteria, submissiveness,
neuroticism, extraversion, social adjustment,
impunitiveness, etc. These concepts
were measured by five tests: Minnesota
Multiphasic Personality Inventory (MMPI;
Hathaway & McKinley, 1951), Minnesota
Personality Scale (MPS; Darley & McNamara,
1941), Rosenzweig (1938) Picture-Frustration
(P-F) Study, Puzzles “Repression”
Test (Rosenzweig & Sarason, 1942),
and Acquiescence Tendency (Couch & Keniston,
1960). It was predicted that these tests
would not appreciably correlate with the
criterion of hypnotizability.


TABLE 1


TABULATIONS OF THE SUBJECTS CLASSIFIED INTO
EACH OF THE FOUR CATEGORIES OF
HYPNOTIZABILITY


                          Hypnotizability ratings

                         Less than
                         light      Light   Medium   Deep   Total
Percentage                 28         24      16      32     100
Frequency                   7          6       4       8      25
Had prior evaluations       6          4       1       8      19
New subjects                1          2       3       0       6

IV. Subsidiary criteria of hypnotic performance.
This group refers to measures of
hypnotic performance other than the specific
criterion of hypnotizability used in this investigation.
These measures were secured by two
tests: Subjective Estimates of Percentage
Depth and the Stanford Hypnotic Susceptibility
Scale (SHSS), Forms A and B
(Weitzenhoffer & Hilgard, 1959). It was predicted
that these tests would correlate positively
with the criterion of hypnotizability,
but they were not intended to be considered
as independent predictor variables.

V. Miscellaneous. This group includes a 
number of tests: Postural Sway Test (Ey- 
senck & Furneaux, 1945), Heat Illusion Test 
(Eysenck & Furneaux, 1945), Vividness of 
Mental Imagery Questionnaire, and the 
Wechsler-Bellevue Intelligence Scale, Form 
II (Wechsler, 1946). In addition, the sub- 
jects’ sex also was used as a variable. It was 
predicted that postural sway, heat illusion, 
and mental imagery would correlate positively 
with the criterion of hypnotizability, but that 
intelligence and sex would not correlate. 


TABLE 2
ORDER OF TEST ADMINISTRATION


1. Vividness of Mental Imagery Questionnaire
2. Personal Experiences Questionnaire—Long Form
3. Card 12M of the Thematic Apperception Test
4. Personal Experiences Questionnaire—Short Form
5. Postural Sway Test (I)
6. Heat Illusion Test (I)
7. Traits Regarding Hypnosis Inventory (I)
8. Rosenzweig P-F Study
9. Background Index on Hypnosis
10. SHSS, Form A
11. Hypnotic training and evaluation sessions
12. MPS
13. MMPI
14. Puzzles “Repression” Test
15. Postural Sway Test (II)
16. Heat Illusion Test (II)
17. Wechsler-Bellevue Intelligence Scale, Form II
18. Traits Regarding Hypnosis Inventory (II)
19. SHSS, Form B
20. Over-All Agreement Score
21. Subjective Estimates of Percentage Depth


ORDER OF TEST ADMINISTRATION 


Order of test administration is presented in Table
2. A few of the tests were administered twice. Subjects
were run individually through the sequence of testing
in 3 days to 2 months, as their schedules permitted.
The average was about 3 weeks. The time
required for the subjects to complete the testing
varied between 12 and 20 hours. The average was about
16 hours.


Description of Tests 


The tests included in the study are described 
below.3 

Propensity for unusual subjective hypnotic-like 
experiences. Personal Experiences Questionnaires: A 
relationship has been demonstrated between hypnotic 
performance and life-history reports on naturally 
occurring hypnotic-like experiences (Shor, 1960; 
Shor, Orne, & O’Connell, 1962). A number of in- 
vestigators have incorporated these materials into 
their own prediction studies. With one exception 
(Barber & Calverley, 1965) results have been favor- 
able (As & Lauer, 1962; As, O'Hara, & Munger, 
1962; Evans & Thorn, 1964; London, Cooper, & 
Johnson, 1962; Thorn, 1960).4


3 Specimen forms and scoring instructions of new
tests have been deposited with the American Docu- 
mentation Institute. Order Document No. 8607 from 
ADI Auxiliary Publications Project, Photoduplication 
Service, Library of Congress, Washington, D. C. 
20540. Remit in advance $3.00 for microfilm or $8.75 
for photocopies and make checks payable to: Chief, 
Photoduplication Service, Library of Congress. 

4 This natural occurrence approach has parallels
in earlier studies. See Barry, MacKinnon, and Mur- 
ray (1931), Sutcliffe (1958), White (1937b), and 
Williams (1952). 


Three varieties of Personal Experiences Question- 
naires were used in the present study. 5 
1. Personal Experiences Questionnaire—Long Form
(PEQ-L): A 149-item paper-and-pencil self-report
questionnaire was developed to elicit reports on a
wide variety of hypnotic-like experiences occurring
naturally in the normal course of living, independent
of the use of special techniques, such as hypnosis,
sensory deprivation, drugs, etc. Two scoring systems
were used: frequency—how often the subjects have
had the experience described, and intensity—how
vivid and profound was a subject's single most intense
experience of it. Relevant quantitative scales
were provided. (Also discussed in Shor et al., 1962.)

2. Imaginary playmates: At the end of the PEQ-L 
were appended a number of questions inquiring 
about the existence and apparent reality of imagi- 
nary playmates during childhood. These questions 
were scored as a separate unit.

3. Personal Experiences Questionnaire—Short
Form: In a prior publication normative data were 
presented on 44 items selected from the PEQ-L 
(Shor, 1960). The scoring system was based on 
simple occurrence, that is, subjects replied only on 
whether they had ever had the experiences described. 

Attitudes and motivational factors specifically re- 
lating to hypnosis. Card 12M of the Thematic Ap- 
perception Test (TAT): Four cards were selected 
from the TAT and administered with the standard 
modification for written responses. 

In order of administration, the four cards were: 
boy with violin, 1; two men standing, 7BM; man 
and reclining figure, 12M; and men reclining, 9BM. 
The third card in this set, 12M, has often been 
interpreted as depicting hypnosis, and it was taken 
as the "hypnosis" card which White (1937a) and 
later Sarason and Rosenzweig (1942) found elicited 
attitudes toward hypnosis which correlated with 
hypnotizability. A number of other investigators 
have also used Card 12M in their hypnosis research 
(Levitt, Lubin, & Brady, 1962; Levitt, Lubin, & 
Zuckerman, 1959; Schneck, 1951; Secter, 1961b; 
Ventur, Kransdorff, & Kline, 1956). 

Transcripts of the Card 12M protocols were coded 
and randomized. Four judges independently rated 
the protocols for estimates of hypnotizability, with- 
out instruction or restrictions as to the criteria of 
judgment to be applied.6

Traits Regarding Hypnosis Inventory: A paper- 
and-pencil inventory was designed to elicit attitudes 


5 It was discovered too late that the original
“hypnosis” card was not Card 12M. The original 
drawing, lost sight of during the early years of the 
TAT's standardization, was very similar in scenic 
content to the present Card 12M, but it had more 
hypnotic quality. A copy of the original card has 
since been secured from H. A. Murray. 

6 Two other scoring methods had been planned:
the criteria which White's judges appeared to have 
used and Sarason and Rosenzweig's system. Both 
methods proved inapplicable to the present 
12M data. 
toward hypnosis by means of a brief adjective check
list. The objective was to develop a device yielding
information comparable in principle with White's
use of the “hypnosis” TAT card. The inventory was
designed to preserve some of the features of a
projective device but with objective scoring. The
inventory had two parts. The first inquired about
traits presumed to characterize a good hypnotic subject;
the second inquired about traits presumed to
characterize a good hypnotist. Scoring was the sum
of favorable plus not-unfavorable responses.

Background Index on Hypnosis: A paper-and-pencil
questionnaire was designed to inquire about
the subjects’ knowledge, attitudes, and impressions
about hypnosis. The questionnaire was composed of
a number of separate sections.7

1. Impressions of Percentage Pleasantness: Subjects
were first required to describe their prior experiences
as hypnotic subjects, their observations and reading
about hypnosis, etc. They were then asked to estimate
from all of their sources of factual information
what percentage of the time the typical subject in
hypnosis seemed to be enjoying himself and what
percentage of the time hypnosis seemed unpleasant
to him.

2. Circumstances of Agreeing to Participate in
Hypnosis: Subjects were asked to describe the circumstances
under which they would volunteer to
participate in hypnosis. Seven specific situations were
cited covering a wide range of circumstances; for
example, medical research, a fraternity party, etc.
Scoring was based on the sum of agreements to
participate.

3. The Effects of Conditions on Initial Induction:
A check list was provided in which subjects were
asked to classify a series of 73 items in terms of
how they felt specific circumstances would affect
initial hypnotic induction. Typical items were “being
comfortable,” “a close friend of your choice watching,”
“just having failed an examination,” etc. A
5-point scale was provided: (a) necessary to induce
hypnosis, (b) favorable in inducing hypnosis, (c)
neutral or uncertain, (d) unfavorable in inducing
hypnosis, and (e) prevents hypnosis. Scoring was
based on the number of extreme responses (summation
a + e).

4. Conceptions of Hypnotic Depth: A 30-item
check list was provided for subjects to classify their
impressions of the depth of hypnosis required to first
produce a series of described phenomena. Typical
items were: “the inability to open the eyelids when
challenged to do so,” “the feeling of not wanting to
resist the hypnotist’s suggestions,” “feeling as if
your body were drifting through space,” and so
forth. Eight of the items were grossly farfetched;
for example, “the ability to accurately predict the
future by going forward in time.” A 5-point scale
was provided: (a) waking state, (b) light hypnosis,


7 For related approaches on measuring attitudes see
Brightbill and Zamansky (1963), London et al. 
(1962), Melei and Hilgard (1964), and Rosenhan 
and Tomkins (1964). 
(c) medium hypnosis, (d) deep hypnosis, and (e)
does not happen. 

A scoring stencil was designed to yield three sepa- 
rate scores which may be briefly characterized as 
follows: extent of “magical” notions, extent of 
“skepticism,” and extent of agreement with the opin- 
ions of recognized authorities on hypnosis. 
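
As an illustration of the scoring rule stated for the Effects of Conditions on Initial Induction check list (the number of extreme responses, a + e), a minimal sketch follows. The function name and the representation of a protocol as a list of single letters are assumptions made for the example; the original scoring materials are not reproduced in this report.

```python
def extreme_response_score(responses):
    """Count the items answered with either extreme of the 5-point scale.

    `responses` is a list of letters 'a' to 'e', one per check-list item:
    (a) necessary to induce hypnosis ... (e) prevents hypnosis.
    """
    valid = set("abcde")
    for answer in responses:
        if answer not in valid:
            raise ValueError(f"unexpected response: {answer!r}")
    # Extreme responses are the two end points of the scale, (a) and (e).
    return sum(1 for answer in responses if answer in ("a", "e"))


# Hypothetical protocol of ten items (the real check list had 73).
example_protocol = list("abceadcbea")
example_score = extreme_response_score(example_protocol)
```

A subject who marks many circumstances as either required for or fatal to induction receives a high score, whatever the direction of those judgments.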

Personality attributes. The MMPI and the MPS:
The MMPI is a paper-and-pencil self-report personality
inventory of 550 items providing measures
of nine basic psychiatric scales as well as many
derived scales. The MPS is a paper-and-pencil self-report
personality inventory of 218 items providing
five measures of individual and social adjustment.
Unlike the MMPI, originally standardized to diagnose
common psychiatric classifications, the MPS
has been standardized to be applicable to the features
of personality adjustment most relevant to the
general college population.

Although it was not feasible to include every personality
inventory related to hypnotizability by one
or another investigator, it was believed that the
MMPI and the MPS together would be representative
of most measures available through this type of
paper-and-pencil instrument.8

Rosenzweig P-F Study and Puzzles “Repression”
Test: In 1938 Rosenzweig hypothesized that hypnotizability
was positively associated with repression
as a preferred mechanism of defense and with
impunitiveness as a characteristic type of immediate
reaction to frustration. Evidence for this hypothesis
was later reported (Rosenzweig & Sarason, 1942), in
which impunitiveness was measured with the Rosenzweig
P-F Study (a paper-and-pencil inventory) and
repression was measured by amount of negative
Zeigarnik effect under anxiety-provoking circumstances.
In the repression test a set of 6-8 piece jigsaw
puzzles was administered under the guise of an
intelligence test in such a way that the subjects could
successfully complete only half of the puzzles.9 A


8 See in this regard Barber (1956); Barber and
Calverley (1964, in press); Barry et al. (1931);
Cooper and Dana (1964); Das (1964); Faw and
Wilcox (1958); Friedlander and Sarbin (1938); Hilgard
and Lauer (1962); Lang and Lazovik (1962);
Levitt, Brady, and Lubin (1963); Messer, Hinckley,
and Mosier (1938); Moore (1961); Sarbin (1950);
Schulman and London (1963); Secter (1961a);
Thorn (1960); Weitzenhoffer and Weitzenhoffer
(1958); White (1930); White (1937b, 1941); Wilcox
and Faw (1959). A number of investigators have related
Rorschach test personality variables to hypnotizability:
Bergmann, Graham, and Levitt (1947);
Brenman and Reichard (1943); Levine, Grassi, and
Gerson (1943); Sarbin (1939); Sarbin and Madow
(1942); and Schafer (1947).

9 An intensive search failed to locate Rosenzweig 
and Sarason's original set of puzzles. New materials 
were thus compiled and carefully pretested. Care was 
taken to preserve and enhance the features of the 
test which Rosenzweig and Sarason had considered 
important, such as making the test appear to be 
a commercially available intelligence test. 
greater percentage of recall of successfully completed
items was taken as the index of repression. The re- 
pression study has been replicated, however, with 
null findings. (See Eysenck, 1947; Petrie, 1948. On 
Impunitiveness, see Willey, 1951.) 

Impunitiveness and repression scores were com- 
puted by the methods described by Rosenzweig and 
Sarason. Because it seemed that the computation of 
the impunitiveness score had a considerable non- 
objective component, all protocols were coded and 
scored blindly by three judges. Average interrater 
reliability was .78. The scores of the judges were 
then averaged. 

Acquiescence Tendency: Agreeing Response Set 
has been defined as the general tendency to agree 
with psychological test items regardless of their 
content. A number of investigators have hypothe- 
sized that this general tendency is a manifestation of 
a relatively stable personality characteristic to ac- 
quiesce to authority (Couch & Keniston, 1960). 
Theorists often have supposed that highly hypnotiz- 
able individuals possess this attribute. Two measures 
of acquiescence tendency (agreeing response set) 
were included in this study: Over-all Agreement 
Score (OAS; Couch & Keniston, 1960), and the 
summation of responses marked true on the MMPI. 

Subsidiary criteria of hypnotic performance. Sub- 
jective Estimates of Percentage Depth: During an 
interview conducted by one of the investigators 
(RES) at the end of the battery of testing, sub- 
jects were asked to estimate how deeply they had 
been hypnotized in the hypnotic training sessions in 
terms of their own, unaided understandings of the 
deepest hypnosis. A specific percentage rating scale
was provided, a variant of procedures used by
earlier investigators who had reported on subjective
estimates of depth (Barry et al., 1931; Hatfield,
1961; Israeli, 1953; LeCron, 1953; White, 1930). 

SHSS: The SHSS was administered twice to each 
subject by one of the investigators (DNO'C). Form 
A of the scale was always administered first, before 
the hypnotic training and evaluation sessions; Form 
B was always administered second, after the training 
sessions. 

Miscellaneous. Postural Sway Test and Heat Illu- 
sion Test: Eysenck (1947), Eysenck and Furneaux 
(1945), and Furneaux (1946, 1956), using hospital 
patient populations, reported multiple correlations 
between hypnotizability and the Postural Sway Test 
and the Heat Illusion Test of .96 and .92. The cor- 
relation between hypnotizability and the Postural 
Sway Test alone was reported as .73 and .64. The 
correlation between hypnotizability and the Heat 
Illusion Test alone was reported as .51 and .59.

The Postural Sway Test, standardized by Hull 
(1933), measures the amount of bodily sway in re- 
sponse to so-called waking suggestions during a speci- 
fied time period. 

The Heat Illusion Test was described as early as
1893 by Scripture. The subject is asked to hold an
electrical resistor which is slowly heated as he turns
a calibrated knob. The subject is then asked to report
when he first begins to feel heat. The indicator
is then turned back to 0, and the procedure repeated.
The second time, however, the current has been
secretly turned off.

The procedures for administering and scoring
these two tests as described by Eysenck were replicated
closely. Eysenck’s original recording of swaying
suggestions was secured from Star Sound Recording
Studios, Cavendish Square, London, and
used throughout. Wording of other procedures was
kept identical. The only known modification was
that a silent time-delay hidden switch was built into
the Heat Illusion apparatus rather than a manual
hidden switch.


The Heat Illusion Test was concealed among a
series of five other “Perceptual and Physiological
Tests,” which were not scored. The Postural Sway
Test was administered at the end of the series.

It had been the informal experience of the investigators
that the Postural Sway Test and other similar
measures discriminated moderately well between
those subjects who later responded positively to at
least the easiest hypnotic suggestions and those subjects
who later showed even fewer or no hypnotic
responses at all. The investigators had never found
such tests more than weak predictors, however, of
ultimate hypnotic performance. Similarly, our impression
was that the Heat Illusion Test had only
a slight to moderate predictive value. Thus, it was
predicted that the Postural Sway and Heat Illusion
tests would correlate with hypnotizability, but that
the multiple prediction would not be very high.

Vividness of Mental Imagery Questionnaire: A 
paper-and-pencil questionnaire of 15 items was de- 
signed to inquire about the vividness of the mental 
imagery which the subjects report having generally 
available in the usual waking state. Subjects were 
asked to rate on a 7-point scale the clarity and 
vividness of their waking imagery in various sensory 
modalities. The questionnaire was a variation of 
Betts’ (1909) Imagery Questionnaire, which Sutcliffe 
(1958) had found differentiated his somnambules 
from nonsomnambules. McBain (1954) also had 
found a relationship between imagery and hypnotizability.
The new questionnaire was evolved to provide
simpler items, more in keeping with the type of
imagined experiences often required of hypnotic 
subjects. 

It was predicted that vividness of imagery would
correlate with hypnotizability, but it was felt that the
correlation might be artifactual. Shor (1962) has
theorized that individuals with more vivid waking
imagery have an uncontrolled advantage in the
performance of those hypnotic phenomena involving
imagery, particularly in hallucinations. The crux
of hypnotic fantasy, in Shor’s theoretical view, is
not the vividness of the mental imagery as such
but rather how completely the subjects believe in
the reality of the hypnotic fantasy at the moment of
the experience. The view is that even relatively
shoddy imagery may appear phenomenally real to
the subject at the moment of the experience provided
his usual waking standards of comparisons have
sufficiently faded.


TABLE 3
PERSONAL EXPERIENCES QUESTIONNAIRES (PEQ)


                                         Hypnotizability

PEQ-Long Form: Frequency                      Ager
PEQ-Long Form: Intensity                      .36*
Imaginary playmates: Coded rating on
  their apparent reality                      Ager
PEQ-Short Form: Simple occurrence             AG


*p = .10.
**p = .05.


Wechsler-Bellevue Intelligence Scale, Form II:
The Wechsler-Bellevue Intelligence Scale, Form II,
was individually administered to the subjects by a
trained research assistant. A number of investigators
have reported positive correlations of intelligence
with hypnotizability (Barry et al., 1931; Curtis,
1943; Davis & Husband, 1931; Friedlander & Sarbin,
1938; Hull, 1933; White, 1930).


Sex: Sex differences in hypnotizability favoring
females have occasionally been reported (Davis & 
Husband, 1931; Friedlander & Sarbin, 1938; Hil- 
gard, Weitzenhoffer, & Gough, 1958; London et al., 
1962). 


RESULTS 


Correlations are reported for each predictor
variable against the criterion of hypnotizability.
As pertinent, other correlations are also
presented. Since directions of relationship
often were predicted it seemed valuable also
to report the .10 level.10
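
To make the reported significance levels concrete, the sketch below converts a correlation into a t statistic and compares it with tabled critical values. It rests on assumptions not stated in the report: that the usual t test for a Pearson coefficient was applied, that the levels are two-tailed, and that the full sample of 25 subjects (23 degrees of freedom) was available. The critical values are copied from a standard t table.

```python
from math import sqrt

N_SUBJECTS = 25  # full sample size reported in the Subjects section

# Two-tailed critical t values for df = 23, from a standard t table.
T_CRIT_DF23 = {0.10: 1.714, 0.05: 2.069, 0.01: 2.807}


def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero (df = n - 2)."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)


def critical_r(t_crit, n):
    """Smallest absolute r reaching a given critical t with n subjects."""
    return t_crit / sqrt(t_crit ** 2 + (n - 2))


def level_reached(r):
    """Most stringent tabled level reached by r with 25 subjects, or None."""
    t = abs(t_from_r(r, N_SUBJECTS))
    reached = [alpha for alpha, t_crit in T_CRIT_DF23.items() if t >= t_crit]
    return min(reached) if reached else None
```

Under these assumptions a coefficient of about .34 reaches the .10 level, about .40 reaches the .05 level, and about .51 reaches the .01 level. Where data were missing the sample is smaller and the required coefficients are correspondingly larger.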


Propensity for Unusual Subjective Hypnotic- 
like Experiences 


Personal Experiences Questionnaires. Correlations
between hypnotizability and the personal
experiences measures are presented in
Table 3. Response consistency and internal
consistency reliabilities of the Personal Experiences
Questionnaires have already been
reported as very high (.90 to .96, p = .01, in
Shor, 1960, and Shor et al., 1962).11


10 In a very few instances of missing data statistical
significance was determined on the reduced
sample size.

11 In addition, London et al. (1962) found test-retest
reliability of the PEQ-Short Form over a
3-week interval to be .94, p = .01. As’ Experiences
Inventory incorporated the instructions plus 18 items
from the PEQ-Short Form along with 42 items devised
independently; very high stability in answer
percentages across samples was demonstrated for
the common items (As et al., 1962).


TABLE 4
TRAITS REGARDING HYPNOSIS INVENTORY


Hypnotiz- Internal 
ability consistency 
First administration 
Good hypnotic subject Sie 82v 
Good hypnotist -00 68er 
Combined score .37* Bqerk 
Second administration 
Good hypnotic subject 68e Tee 
Good hypnotist Seek peek 
Combined score 69 T9 


Attitudes and Motivational Factors Specifi- 
cally Relating to Hypnosis 


Card 12M of the TAT. Correlations between
hypnotizability and the four judges’
blind ratings were .23; .68, p = .01; .13; and
.44, p = .05. The correlation with the summation
ranks of the judges’ ratings was .58,
p = .01.

Traits Regarding Hypnosis Inventory. Cor- 
relations between hypnotizability and the 
two administrations of the inventory and in- 
ternal consistency reliabilities (split-halves, 
Spearman-Brown) are presented in Table 4. 
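
The Spearman-Brown step mentioned above corrects the correlation between two half-tests to an estimate for the full-length test. A minimal sketch of the standard formula follows; the half-test correlations used in the usage lines are hypothetical, since the uncorrected values are not given in this report.

```python
def spearman_brown(r_half, factor=2.0):
    """Spearman-Brown prophecy formula.

    Estimates the reliability of a test `factor` times as long as the
    part whose reliability is `r_half`. With factor=2 this is the usual
    split-half correction: 2r / (1 + r).
    """
    denominator = 1.0 + (factor - 1.0) * r_half
    if denominator <= 0.0:
        raise ValueError("formula undefined for this r_half and factor")
    return factor * r_half / denominator


# Hypothetical half-test correlations and their corrected values.
corrected_low = spearman_brown(0.50)   # two thirds
corrected_high = spearman_brown(0.70)  # about .82
```

The correction always raises a positive half-test correlation, which is why split-half reliabilities are reported in their corrected form.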

Background Index on Hypnosis. Correla- 
tions between hypnotizability and the index 
are presented in Table 5. Only one of the six 
comparisons achieved significance.¹²


12 It became apparent even before the data were
fully gathered that the background index had a seri- 
ous defect in test construction. The approach used 
was to elicit attitudes toward hypnosis by phras- 
ing questions in the factual format of college exami- 
nations. The subjects’ replies, however, so strongly 
tended to reflect the enlightened skepticism of the 


TABLE 5
BACKGROUND INDEX ON HYPNOSIS

                                                          Hypnotizability

Impressions of percentage of pleasantness                      45
Sum of agreements to participate                               24
Conditions of initial induction: Sum of extreme responses      B
Extent of "magical" notions                                    23
Extent of “skepticism”                                         −.36*
Extent of agreement with the opinions of recognized
  authorities on hypnosis                                      25

* p = .10.

TABLE 6
MINNESOTA MULTIPHASIC PERSONALITY INVENTORY 


Hypnotizability 


Hypochondriasis .08 
Depression 04 
Conversion Hysteria 16 
Psychopathic Deviate +18 
Correction 20 
Masculinity-Femininity —.27 
Paranoia 22 
Psychasthenia —.20 
Schizophrenia —.08 
Hypomania —.17 
Social Introversion —.24 
Cannot Say score —44 
Lie Ager 
Validity 04 
Sum basic clinical scales —.09 
Welsh's First Factor —.10 
Welsh's Second Factor 45 
Ego Strength —02, 
Dependency —45 
Dominance .29 
Responsibility 32 
Prejudice —.10 
Social Status —AT 
Role Playing .08 
Control —.23 
Summation True —.29 
Taylor’s Manifest Anxiety —.05 
** p = .05.


Personality Attributes 


MMPI. Correlations with hypnotizability 
are presented in Table 6. Out of 27 basic and 
derivative scales only the correlation with the 
Lie scale was statistically significant. Except
for Responsibility, none other had a coeffi-
cient larger than .30. By any standards of
multiple probabilities findings in the table
were null.¹³

MPS. Correlations with hypnotizability are 
presented in Table 7; all were negligible. 

Rosenzweig P-F Study and Puzzles “Re-
pression” Test. The correlations between hyp-


college students’ subculture that personal attitudes 
seemed neglected. See London (1961) for a similar 
observation. 

13 Furneaux and Gibson (1961) and Das (1964)
have reported negative correlations between hypno-
tizability and the Lie scale of the Maudsley Person-
ality Inventory. The measurement operations of the
two scales are not similar, however. For other
investigations on hypnotizability with the Maudsley 
Personality Inventory see Cooper and Dana (1964), 
Evans (1963), Furneaux (1961), Hilgard and Bentler 
(1963), Lang and Lazovik (1962), and Thorn 
(1960). 


notizability and the two triadic hypothesis
variables were negligible (with impunitive-
ness, .27; with “repression,” −.18). The
correlation of the two predictors was .20.

Acquiescence Tendency. Correlations be-
tween hypnotizability and the two measures
of acquiescence tendency (agreeing response
set) were both −.27. The correlation between
the two acquiescence measures was .62, p =
.01.


Subsidiary Criteria of Hypnotic Performance 


Subjective Estimates of Percentage Depth. 
The correlation between hypnotizability and 
the percentage estimates was .74, p = .01.

SHSS. The correlation between hypnotiz-
ability and Form A of the SHSS was .75, p
= .01; the correlation with Form B was .93,
p = .01. This increase is statistically signifi-
cant (p < .005). It will be recalled that Form
A was administered before the hypnotic
training sessions; Form B was administered
after their completion. When administered
under comparable conditions, Forms A and
B have been shown to be normative equiva-
lents (Hilgard, Weitzenhoffer, Landes, &
Moore, 1961; Weitzenhoffer & Hilgard, 1959).


Miscellaneous


Postural Sway Test and Heat Illusion Test.
The correlations between hypnotizability and
the two administrations of the Postural Sway
Test were .32 and .37, respectively, p = .10;
were .36, p = .10, and .02, for the Heat Illu-
sion Test; and were .47, p = .05, and .38, p
= .10, for the multiple predictors. Intercor-
relations of the two predictors were .03 for
the first administration and −.08 for the
second administration. The test-retest reli-
ability of the Postural Sway Test was .84, p


TABLE 7
MINNESOTA PERSONALITY SCALE 


Hypnotizability 


I. Morale —.09 

II. Social Adjustment A7 
III. Family-Relations .08 
IV. Emotionality —.08 

V. Economic Conservatism —.11


Total scores (I-V) .04 
TABLE 8
GENERAL SUMMARY OF RESULTS ON HYPOTHESES

Classes and tests                                     Predicted  Observed  Strength                  Hypotheses

I. Propensity for unusual subjective “hypnotic-like” experiences
   Personal Experiences Questionnaires                    +         +      Low to moderate           Confirm
II. Attitudes and motivational factors specifically relating to hypnosis
   Card 12M of the TAT                                    +         +      Negligible to moderate    Confirm
   Traits Regarding Hypnosis Inventory                    +         +      Low to moderate           Confirm
   Background Index on Hypnosis                           +         0      Generally negligible      Reject
III. Personality attributes
   MMPI                                                   0         0      Generally negligible      Confirm
   MPS                                                    0         0      Negligible                Confirm
   Rosenzweig P-F Study                                   0         0      Negligible                Confirm
   Puzzles “Repression” Test                              0         0      Negligible                Confirm
   Acquiescence Tendency                                  0         0      Negligible                Confirm
IV. Subsidiary criteria of hypnotic performance
   Subjective Estimates of Percentage Depth               +         +      High                      Not apply
   SHSS, Forms A and B                                    +         +      High to very high         Not apply
V. Miscellaneous
   Postural Sway Test                                     +         +      Low                       Confirm
   Heat Illusion Test                                     +         +      Negligible to low         Confirm
   Vividness of Mental Imagery Questionnaire              +         +      Moderate                  Confirm
   Wechsler-Bellevue Intelligence Scale, Form II          0         −      Moderate                  Reject
   Sex (females as higher)                                0         +      Moderate                  Reject

Note. Directions: + = positive, − = negative, 0 = negligible.


= .01; of the Heat Illusion Test, .58, p =
.01.¹⁴

Vividness of Mental Imagery Questionnaire
(VMI). The correlation between hypnotiz-
ability and the VMI was .56, p = .01. In-
ternal consistency reliability (odd-even,
Spearman-Brown) for the VMI was .91, p =
.01, for the first administration and .93, p =
.01, for the second.
Wechsler-Bellevue Intelligence Scale, Form
II. The correlation between hypnotizability
and intelligence was −.50, p = .05.

Sex. The correlation (point biserial) be-
tween hypnotizability and sex was .46, p =
.05, with females the more hypnotizable.


SUMMARY OF RESULTS ON HYPOTHESES


A general summary is given in Table 8 
comparing results with the initial hypotheses 
under test. Tests are arranged and classified 
in rows. The predicted and actually observed 
directions of relationships are noted in the 
second and third columns of the table. Ob- 

14 Minor alterations in procedure may improve the 
Heat Illusion Test’s reliability and consequently its 
predictive power (Furneaux, 1964). 


served strengths of the relationships are de-
scribed verbally in the fourth column.¹⁵ Indi-
cated in the final column is whether the find-
ings tend to confirm or reject the initial hy-
potheses.

Most of the hypotheses were supported. 
The Background Index was the only unsuc- 
cessful prediction of a positive relationship. 
Regarding predictions of negligible relation-
ships, two unexpected significant correlations
with hypnotizability were discovered: intelli-
gence and sex.¹⁶ It should be noted that, con-
trary to previous findings, the observed rela-
tionship between intelligence and hypnotiz-
ability was negative in direction.

Except for the Subsidiary Criteria of Hyp- 
notic Performance (which were not distinct 
predictor variables) positive correlations 


15 These descriptions are based on the convention
of general verbal nomenclature for correlations sug-
gested by Guilford (1956, p. 145): .00-.20 slight,
.20-.40 low, .40-.70 moderate, .70-.90 high, .90-1.00
very high. Insignificant correlations are considered
negligible.
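
The verbal nomenclature of this footnote amounts to a simple lookup. The sketch below is illustrative only: the function name is hypothetical, and placing each shared boundary value (.20, .40, .70, .90) in the higher category is an assumption, since the footnote leaves the boundaries ambiguous.

```python
def guilford_label(r: float, significant: bool = True) -> str:
    """Return the verbal label for a correlation coefficient.

    Correlations that fail to reach significance are called negligible,
    whatever their size, as stated in the footnote.
    """
    if not significant:
        return "negligible"
    size = abs(r)  # the label depends on magnitude, not direction
    if size < 0.20:
        return "slight"
    if size < 0.40:
        return "low"
    if size < 0.70:
        return "moderate"
    if size < 0.90:
        return "high"
    return "very high"


# Coefficients reported in the text: SHSS Form B, SHSS Form A, intelligence.
for r in (0.93, 0.75, -0.50):
    print(r, guilford_label(r))
```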

16 It could be argued that a third unexpected sig- 
nificant correlation was found for the Lie scale of 
the MMPI.

TABLE 9
INTERCORRELATIONS OF HYPNOTIZABILITY AND THE FIVE COMPOSITE PREDICTOR VARIABLES

Hypnotizability
I. Personal Experiences Questionnaires, Long Form (sum of ranks of both scores)
II. Card 12M of the TAT (sum of ranks of judges' ratings)
III. Traits Regarding Hypnosis Inventory (sum of totals)
IV. Postural Sway Test (sum of totals)
V. Heat Illusion Test (sum of totals)

* p = .10.
** p = .05.
*** p = .01.


with hypnotizability had been predicted for 
seven types of test. These seven were: Per- 
sonal Experiences Questionnaires, Card 12M 
of the TAT, Traits Regarding Hypnosis In- 
ventory, Background Index on Hypnosis, 


Postural Sway Test, Heat Illusion Test, and
Vividness of Mental Imagery Questionnaire.

To provide a convenient summary index, a
multiple correlation has been computed be-
tween hypnotizability and five of these


FIG. 1. Profile of mean composite T scores (Five Composite Predictors and Average Composite).


seven.¹⁷ The matrix is presented in Table 9.
For stable scores, composites were used, as in-
dicated in the table. The hypnotizability cri-
terion had already been computed
as a composite of both examiner's and ob-
server's final performance ratings. The mul-
tiple correlation is .77, p = .01.¹⁸

By transforming all composite scores into
the same unitage (Anderson & Barnhart,
1959), a profile chart has been constructed to
describe these relationships further. Mean
composite T scores are presented in Figure 1
for each of the four categories of hypnotiz-
ability. Averages of T score means are also
presented. There are only a few minor incon-
sistencies between hypnotizability and the
relative magnitudes of the T score means.
These inconsistencies are eliminated in the
final averages.
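
A common way to put composites on one scale is the linear T score, T = 50 + 10z. The sketch below assumes that convention; the article cites Anderson and Barnhart (1959) without describing the transformation, so the formula, the use of the population standard deviation, and the raw values are assumptions for illustration.

```python
from statistics import mean, pstdev


def t_scores(raw):
    """Convert raw scores to T scores with mean 50 and standard deviation 10."""
    center = mean(raw)
    spread = pstdev(raw)  # population SD; an assumption, see the lead-in
    if spread == 0:
        raise ValueError("all scores are identical; T scores are undefined")
    return [50.0 + 10.0 * (x - center) / spread for x in raw]


# Hypothetical composite scores for five subjects.
hypothetical_raw = [1, 2, 3, 4, 5]
print([round(t, 1) for t in t_scores(hypothetical_raw)])
```

Once every predictor is expressed this way, group means on different tests can be plotted on one profile and averaged directly.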


Discussion 


The hypotheses tested were generally sup-
ported as evidenced by the significance level
of the separate correlations and the summary
of results. Findings confirmed that hypnotiz-
ability could be predicted from a general
propensity for unusual subjective hypnotic-
like experiences, from attitudes and motiva-
tional factors specifically relating to hypnosis,
and from postural sway, heat illusion, and
vividness of mental imagery. In addition, the
hypothesis was supported that there would be
only negligible relationships between hypno-
tizability and the measures of personality
usually employed. It is concluded that the
investigators’ impressions about correlates of
hypnotizability were generally confirmed with
the two exceptions of intelligence and sex.


17 The Background Index on Hypnosis was ex- 
cluded from this comparison because it was unsuc- 
cessful. The successful Vividness of Mental Imagery 
Questionnaire was excluded because the investigators 
had suspected that the relationship might be arti- 
factual. 


18 The multiple correlation was computed stepwise,
entering the five composites in order of their predic-
tiveness. As noted, the best predictor, the Traits
Inventory, correlated .58 with criterion. Adding Card
12M TAT to the predictor yielded a multiple cor-
relation of .70. Entering the PEQ further inflated
the value to .77. The remaining two predictors,
Postural Sway and Heat Illusion, raised the co-
efficient only in the third decimal place.
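
For the two-predictor step described in this footnote, the multiple correlation follows from the three pairwise correlations: R^2 = (r_y1^2 + r_y2^2 - 2 r_y1 r_y2 r_12) / (1 - r_12^2). The sketch below applies that standard formula. The value .58 is the criterion correlation given in the footnote for the Traits Inventory; the second criterion correlation and the predictor intercorrelation are hypothetical placeholders, because the entries of Table 9 are not legible here.

```python
import math


def multiple_r_two_predictors(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r_y1**2 + r_y2**2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12**2)
    return math.sqrt(r_squared)


# 0.58 comes from the footnote; 0.50 and 0.30 are hypothetical values.
print(round(multiple_r_two_predictors(0.58, 0.50, 0.30), 3))
```

The lower the intercorrelation between predictors, the more a second predictor adds, which is consistent with the small gain reported for the last two tests if they overlap with those already entered.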


It should be emphasized that the magni-
tudes of the reported correlations must be in-
terpreted in the light of the special popula-
tion sampled and the small sample size. The
study was conceived to test specific hypoth-
eses about correlations between hypnotizabil-
ity and other parameters in a salient popula-
tion. Because the hypotheses being tested refer
to correlates of hypnotizability and because
coefficients of correlation are often descrip-
tively useful as rough guides in further evalu-
ations, the study was carried out in correla-
tional form. It would be misleading, however,
to take the results at face value in simple cor-
relational terms as if the results were gen-
eralizable directly to a broader population.
The extent to which the correlations reported
here will hold up in other populations remains
to be established in future work. The special
population sampled overrepresents the ex-
tremes of hypnotic responsiveness as com-
pared with the general population. While sali-
ent to the hypotheses under test, such a
sample might be expected to show higher
correlations than samples drawn from broader
populations. Moreover, as the sample size is
small it should be borne in mind that rela-
tively few cases may be responsible for a
given finding.

One of the major differences between this
study and others reported in the literature is
the care with which plateau hypnotizability
was evaluated. This is a time-consuming pro-
cedure and was largely responsible for the
limited sample size. However, we feel it is
indispensable for some purposes. A simple
and brief procedure for estimating hypnotic
depth such as the SHSS was found to reflect
the hypnotizability criterion quite accurately
(.93) after subjects had been trained to
plateau hypnotic performance, but the
SHSS yielded only a correlation of .75 with
the hypnotizability criterion when adminis-
tered prior to the achievement of plateau, and
even this correlation is probably spuriously
high due to the fact that most of the subjects
already had had a great deal of exposure to
hypnotic training prior to entering the study.
Thus, while measures such as the SHSS are
very useful and may accurately reflect hypno-
tizability after training, it is dubious whether
reliance should be placed on such brief esti-
mates without extensive hypnotic training to
plateau hypnotizability if one is attempting
to assess correlates of established hypnotiz-
ability rather than of just initial hypnotic
performance.

The general confirmation of our predictions 
under the specified conditions is, however, 
only a first step into elucidating the functional 
dependencies underlying the observed corre- 
lations. As was noted earlier, we were con- 
cerned with the extent to which the resulting 
correlations were a function of hypnotizabil- 
ity as a trait as opposed to the extent to which 
they were determined by the demand charac- 
teristics of the experimental situation. We 
wondered whether our own initial hypotheses 
and the subjects’ perception of these hy- 
potheses might not have set into motion in- 
teracting expectancies and other situational 
inferences, subtly altering the pattern of cor- 
relations in the direction of confirming the 
initial predictions. To what extent did postural 
sway predict hypnotizability in the study be- 
cause of an inherent, intrinsic relationship or 
to what extent did postural sway predict hyp- 
notizability because both the experimenter 
and the subjects shared the belief that it 
would predict it? 

It should be emphasized that attitude meas- 
ures were taken after most of the subjects 
had had considerable exposure to hypnotic 
training. It has been noted in some studies 
(e.g., Melei & Hilgard, 1964) that attitudes 
may be influenced by successful or unsuccess- 
ful hypnotic experience. 

The hypothesis that demand characteristics,
experimenter bias, order effects, and situa-
tional factors may confound the observation
of reliable correlates of hypnotizability pro-
vides a useful context for future empirical
work and may help explain the conflicting
results of earlier studies. Certainly the rela-
tive lack of success noted in the literature in
identifying reliable correlates of hypnotiz-
ability is in striking contrast to the relative
ease with which hypnotic performance may
be assessed using a work sample. Eventually
we may learn to identify and isolate the con-
founding factors by studying how correlates
of hypnotizability are affected or remain in-
variant under differing experimental condi-
tions. In the present study the investigators’
hypotheses were tested under conditions
closely resembling those in which they were
derived by the investigators. We have been
careful to specify the conditions and the hy-
potheses. Further studies will attempt to
evaluate the extent to which these hypotheses
hold under other conditions and with differing
sets of expectancies by both experimenter and
subjects.


PREFERENCE FOR INFORMATION ABOUT AN UN-
CERTAIN BUT UNAVOIDABLE OUTCOME¹


JOHN T. LANZETTA AND JAMES M. DRISCOLL


Dartmouth College 


The choice of information or no information was examined where outcomes 
were shock-no shock, reward-no reward, and shock-reward. Although informa- 
tion was noninstrumental in that it permitted no overt modification of outcome 
frequency or severity, Ss chose information more frequently than expected by
chance. There were large and consistent individual differences in preference for
information and search remained stable despite changes in outcome conditions. 
GSR was used to determine if information permitted S to modify his reaction 
to an outcome, but results provided only weak support for this expectation. 
The relationship of the probability and latency of search, however, suggested an 
uncertainty-conflict formulation; where search was most frequent, latency of 


search was longest. 




A number of studies (Lockhard, 1963; 
Perkins, Levis, & Seymann, 1963; Pervin, 
1963; Prokasy, 1956) have shown that or- 
ganisms prefer information about an outcome 
even when the information has no apparent 
instrumental value. In the paradigm em- 
ployed in these investigations, the subject is 
confronted with a choice between two condi- 
tions (A, B) in both of which he will receive 
an outcome such as shock or reward with 
probability P. However, in Condition A, the 
occurrence of the outcome is signaled by dis- 
tinctive cues while in Condition B it is not. 
The cues inform the subject what outcome 
will occur, but the subject is unable to 
overtly change the immediacy, probability, or 
magnitude of the outcome. Since outcomes 
are independent of the subject’s behavior, 
choices of the information or no-information 
condition are reinforced equally often and 
one would not expect a consistent preference 
to develop. The data, however, indicate that 
decided preferences are acquired: with either 
food (Prokasy, 1956) or shock (Lockhard, 
1963; Perkins et al., 1963) as the outcome, 
rats develop a consistent preference for the 
information condition; with shock as the out- 
come, human subjects rate an information 


1 This research was supported by the Air Force 
Office of Aerospace Research, Contract AF 49(638)- 
1441 and the Office of Naval Research, Contract 
Nonr-2285(04) Project MRO005.12-2005.01 from the 
Behavioral Science Division of the Naval Medical 
Research Institute, Bethesda, Maryland. 


condition (Pervin, 1964) as preferred to a
no-information condition.
When information is in some way instru-
mental, reinforcement theory provides a reas-
onable basis for explanation and prediction
(Wyckoff, 1952) of information-search be-
havior. However, when information has no
apparent instrumental value, a generally ac-
cepted theoretical explanation has not been
advanced. Some investigators extend rein-
forcement theory to this situation by sug-
gesting covert ways in which information can
be instrumental; Prokasy (1956) and Lock-
hard (1963), for example, hypothesize that
information permits the subject to make
preparatory responses which render a reward
more “positive” or a punishment less
“negative.” Pervin (1964) suggests that in-
formation permits the subject to form an
appropriate expectancy and to develop a pre-
dictive strategy, thereby eliminating surprise
and conflict, a position similar to that of
Berlyne (1960), whose conflict-arousal formu-
lation suggests that outcome uncertainty gen-
erates conflict and arousal which motivates
information search. Other investigators postu-
late the existence of a “need” to know (Brim
& Hoff, 1957) or a “need” for information
(Jones, Wilkinson, & Braden, 1961) which
motivates the search for information in
situations of uncertainty.

Implicit, if not explicit, in all of these
theories is the recognition that both the level
of uncertainty and the nature of the outcome
are important variables whose effects on
overtly noninstrumental search behavior must
be explored. Furthermore, the state of the
organism, whether its conflict (Pervin, 1964),
arousal (Berlyne, 1960), or modified affective
reaction to outcomes (Lockhard, 1963; Prokasy,
1956), is suggested as a critical explanatory
variable.

The present study was designed to examine
the effects on search of type of outcome
(whether positive or negative), and to moni-
tor the affective state of the subject in an
attempt to clarify the basis for overtly non-
instrumental search behavior. The study had
three objectives: to extend to the human level
the analysis of the effects of uncertain and
unavoidable outcomes on information search
where, to date, only Pervin’s (1964) paired
comparison preference data are available; to
assess the differential effects of type of out-
come (shock, monetary reward) on search;
and in particular to measure possible changes
in the affective or arousal state of the subject
in response to information.


METHOD
Subjects 


Subjects were 24 male undergraduates, recruited 
individually from freshman physical-education classes 
at the University of Delaware. 


Apparatus 


Subjects sat before a table containing a poker- 
chip delivery mechanism, a response box, and a 
signal panel. The signal panel presented three lights: 
an amber warning signal and a green and a red
information signal. The response box contained two
buttons, the right one labeled "information" and the
left one labeled "no information." The chip delivery
mechanism was simply a large black box that auto- 
matically delivered a chip on rewarded trials. Shock 
electrodes were available for attachment to the calf 
of the subject's leg. The electrodes were two metal 
disks about 2 centimeters in diameter and 5 centi- 
meters separated. 

The warning light, information lights, and out- 
comes were programed using a Massey-Dickinson 
programmer. A probability generator selected out- 
comes with a probability of .5. 

Subject’s GSR (skin potential) was measured as 
the voltage difference between two 2-centimeter elec- 
trodes on the palm and dorsum of the subject's left 
hand. This voltage difference was amplified and 
recorded by an Offner Type RS dynograph. 

The occurrence of the signal lights and the sub- 
ject's choice of information or no information were 


recorded by an Esterline-Angus event recorder. Pro-
graming and recording equipment were in an ob-
servation room separated from the experimental
room by a one-way mirror. When the experimenter
finished instructing the subject, he remained in the
observation room until the session was completed.


Design 


Each subject was given 90 trials on which he had 
the choice of acquiring or not acquiring information 
about which of two equally likely outcomes would 
occur. The trials were divided into three blocks of 
30 trials, each block presenting a different outcome 
combination. The three outcome combinations were: 
shock-no shock, reward-no reward, and shock-
reward. Subjects received the shock-no-shock and
reward-no-reward trials in counterbalanced order;
the shock-reward condition was presented last.
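
The block ordering described above can be sketched as follows. The article says only that the first two conditions were counterbalanced and the third came last; alternating the order by subject number is an assumption made for illustration, and the function name is hypothetical.

```python
def block_order(subject_index: int) -> list:
    """Return the order of the three 30-trial blocks for one subject.

    Even-numbered subjects start with shock-no shock, odd-numbered subjects
    with reward-no reward (an assumed assignment rule). The shock-reward
    block is always last, as stated in the Design section.
    """
    first_two = ["shock-no shock", "reward-no reward"]
    if subject_index % 2 == 1:
        first_two.reverse()
    return first_two + ["shock-reward"]


orders = [block_order(i) for i in range(24)]  # 24 subjects, as in the study
print(orders[0])
print(orders[1])
```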


Procedure 


During placement of the GSR electrodes, the 
subjects were instructed that the experimenter was 
interested in studying how individuals respond 
physiologically to pleasant and unpleasant outcomes. 
The subject was told what outcomes were possible 
in the first situation and the experimenter stressed 
that each outcome would occur equally often on 
a random basis—which occurred being impartially 
controlled by a randomizing "machine." The subject 
could discover which outcome was to occur by push- 
ing a button labeled "information" during the 5- 
second warning signal (amber light). This choice 
resulted in either a red or green light indicating 
which outcome was to follow (in the shock-no-shock 
condition red indicated shock and green no shock; 
in the reward-no-reward condition red signaled no 
reward and green, reward; in the shock-reward con- 
dition red signaled shock while green indicated 
reward). One of these information lights lit at the 
termination of the warning signal and remained lit 
for 5 seconds. Following this, there was a 5-second 
delay before the outcome occurred. If the subject 
depressed the “no-information” button, there was 
a 10-second period (with no signal) before the 
outcome occurred. The subject was warned to press 
one of the buttons on every trial or he would be 
automatically shocked for not responding. The inter- 
trial interval was 5 seconds. 
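
The timing of a single trial can be laid out as a small schedule. The durations and signal meanings are taken from the Procedure; treating the 10-second no-signal period as starting when the warning light goes off is an assumption, and all names are hypothetical.

```python
WARNING_S = 5          # amber warning light
INFO_LIGHT_S = 5       # red or green information light
POST_INFO_DELAY_S = 5  # delay between information light and outcome
NO_INFO_WAIT_S = 10    # unsignaled wait after a no-information choice

# Meaning of each information light in the three outcome conditions.
SIGNAL_MEANING = {
    "shock-no shock": {"red": "shock", "green": "no shock"},
    "reward-no reward": {"red": "no reward", "green": "reward"},
    "shock-reward": {"red": "shock", "green": "reward"},
}


def trial_timeline(choice: str) -> list:
    """Return (start_s, end_s, event) tuples from warning onset to outcome."""
    events = [(0, WARNING_S, "amber warning light")]
    t = WARNING_S
    if choice == "information":
        events.append((t, t + INFO_LIGHT_S, "information light"))
        t += INFO_LIGHT_S
        events.append((t, t + POST_INFO_DELAY_S, "delay"))
        t += POST_INFO_DELAY_S
    elif choice == "no information":
        events.append((t, t + NO_INFO_WAIT_S, "no signal"))
        t += NO_INFO_WAIT_S
    else:
        raise ValueError("choice must be 'information' or 'no information'")
    events.append((t, t, "outcome onset"))
    return events


def outcome_onset(choice: str) -> int:
    """Seconds from warning onset to the outcome."""
    return trial_timeline(choice)[-1][0]


print(outcome_onset("information"), outcome_onset("no information"))
```

Under these assumptions the outcome arrives at the same moment whichever button is pressed, so choosing information cannot change when the outcome occurs.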

Shock was always a pulse of .5 second duration 
at a voltage level set by the subject as "annoying" 
prior to the first session. Reward was a chip worth 
$.10 to the subject after the experiment. 


RESULTS²
Information Search 


2 We are indebted to Elaine Thompson for her
invaluable assistance in collecting the data and
analyzing the results.

Subjects demonstrated a preference for
information in all three outcome conditions,
but there were no differences in the level of
search between conditions. The mean number 
of information-search responses over the 30 
trials was 19.5, 18.0, and 18.8 for the shock- 
no-shock, reward-no-reward, and shock-re- 
ward conditions, respectively. T tests (Dixon 
& Massey, 1957, p. 100) on the mean fre- 
quency of search per subject showed the 
means for all three outcome conditions to be 
significantly above that expected by chance 
assuming an equal probability of information 
and no-information responses (p < .025), but 
individual comparisons (p. 107) showed no 
significant differences in the mean frequency 
of information search among the three experi- 
mental conditions. Information search per 
subject ranged from 23 to 89 responses for
the three sessions combined (90 trials), but 
only 5 of the 24 subjects searched on less 
than half the trials. 
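
The comparison with chance can be sketched as a one-sample t test of per-subject search counts against the 15 responses expected in 30 trials if both buttons were equally likely. This is an assumption about the form of the test; the article cites Dixon and Massey (1957, p. 100) without giving the computation, and the counts below are invented for illustration, not the study's data.

```python
import math
from statistics import mean, stdev


def one_sample_t(values, expected):
    """Return (t, degrees_of_freedom) for a test of the mean against `expected`."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    standard_error = stdev(values) / math.sqrt(n)  # sample SD, n - 1 denominator
    return (mean(values) - expected) / standard_error, n - 1


# Hypothetical search counts out of 30 trials for five subjects.
hypothetical_counts = [15, 17, 19, 21, 23]
t_value, df = one_sample_t(hypothetical_counts, expected=15)
print(round(t_value, 3), df)
```

The resulting t would then be referred to a t table with n - 1 degrees of freedom, 23 for the 24 subjects of the study.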

Intercorrelations of the number of informa- 
tion-search responses per subject across the 
three experimental conditions indicate that 
the subject's search in one condition corre- 
lated positively with his search in all other 
conditions (shock-no shock with reward-no 
reward, r = .69; shock-reward with reward-
no reward, r = .61; shock-reward with shock- 
no shock, r = .48; all p’s < .025). In general 
then, the subjects prefer information to no 
information whatever the nature of the 
uncertain outcome, but there are large 
and consistent individual differences in such 
preferences.


Latency 


The latency of the subject’s choice of in- 
formation or no information was recorded 
because of its direct relevance to a conflict 
interpretation of search (Berlyne, 1962). 

There were no significant differences among 
conditions in the time it took the subjects to 
choose a button to indicate a preference 
(for information and no-information responses 
combined). The mean latency was 1.94, 1.88, 
and 1.80 seconds for the shock-no-shock, 
reward-no-reward, and shock-reward condi- 
tions, respectively. Separating information re- 
sponses from no-information responses, mean 
latency across all three experimental condi- 
tions was 1.88 seconds for information-search 
responses and 1.89 seconds for no-information
responses. Within conditions, the largest dif-
ference between information and no-informa-
tion response latencies was a negligible .06
second occurring within the shock-no-shock
condition. Again, large and consistent indi-
vidual differences were evident and correla-
tions of latency scores across experimental
conditions by subjects were significantly posi-
tive (shock-no shock with shock-reward, r
= .66; shock-reward with reward-no reward,
r = .64; shock-no shock with reward-no
reward, r = .66; all p's < .01).


Galvanic Skin Response (GSR) 


GSR response to outcomes. Preparatory
response theories of search suggest that in-
formation about an imminent event permits
the subject to modify his reaction to the
outcome. To examine if the subjects differen-
tially react to outcomes, with and without
information, the subject’s GSR response was
measured at the time the outcome was pre-
sented. The criterion used was the voltage
difference between the base line at the onset
of the outcome (or a comparable point when
nothing occurred) and the peak of the largest
possible deflection during the 5 seconds prior
to the onset of the next trial. Where only a
negative deflection occurred, its magnitude
was measured and the sign of the deflection
was retained. Though there is some ambiguity
regarding the meaning of positive and nega-
tive skin potentials (see Holmquest & Edel-
berg, 1964; Shaver, Brusilow, & Cook, 1962),
the occurrence of predominantly positive
average scores per subject following shock (19
subjects out of 24) and negative average
scores per subject following no shock (16 sub-
jects out of 24) in the shock-no-shock condi-
tion supports the general notion that negative
scores represent small reactions and positive
scores represent large reactions.

Table 1 presents the mean difference in
voltage per subject per trial for the outcomes
of the experimental conditions. Of the various
individual comparisons (Dixon & Massey,
1957) between the mean reaction to out-
comes under the three outcome conditions,
only the following were significant: the dif-
ference between the subject’s reaction to
shock and no shock following information
(p < .01) and no information (p < .02) in
the shock-no-shock condition, and the dif-
ference between the mean reaction to shock
and reward in the shock-reward condition
when preceded by information (p < .05).
This difference was not significant when
preceded by no information (p < .10).
GSR response to information. To examine
if the subject’s reaction to information cues could
account for his preference for information,
the reaction to information was measured em-
ploying the same criterion used in assessing
the reaction to outcome. However, this mea-
sure was taken during the period from the
onset of the information signal to the onset
of the outcome (Table 1). Within the shock-
no-shock condition, information that shock
was to occur produced a significantly greater
reaction than information that no shock was
to occur (p < .01), or the absence of infor-
mation signals (p < .02). There were no sig-
nificant differences in the reactions to in-
formation in the reward-no-reward or shock-
reward conditions.
To further examine individual differences 
in the task, the Subject’s GSR reactions to 
related information and outcomes within con- 
ditions were correlated. Within conditions, all 
correlations were significantly positive beyond 
the .01 level, indicating individual consistency 
in reaction to information and outcomes. 
Only one correlation of GSR scores by the 
subject across conditions was significant; the 
reaction to shock in the shock-no-shock and 
reward-shock conditions (r= .82, p < .01). 
The range of scores elicited by reward and no
shock was probably too restricted to yield
meaningful correlations. 


Further Examination of Individual Differences


The consistency in individual preferences 
for information or no information suggests 
that individuals may differ in responsiveness 
to the outcomes, to information, or to out- 
comes when preceded or not preceded by 
information. Such differential reactions, if
they exist, should be most pronounced under 
the shock-no-shock condition. Therefore for 
this condition correlations were computed be- 
tween the percentage of information search 
responses and the following indices: GSR re-
sponse to shock after information (r = .14)
and after no information (r = .19), GSR re-
sponse to information that shock would occur
(r = .20), difference in GSR response to in-
formation that shock would occur and infor-
mation that shock would not occur (r = .25),
difference in GSR response to shock with
warning and without warning (r = -.22).
None of these correlations reached signifi-
cance.


TABLE 1

MEAN CHANGE IN SKIN POTENTIAL (IN MILLIVOLTS)
IN RESPONSE TO CUES AND OUTCOMES
ON FOUR TRIAL TYPES

Trial type            Information    Outcome

Information
  Shock                   .886         1.286
  No shock               -.247         -.143
No information
  Shock                   .198a        1.377
  No shock                              .337

Information
  Reward                  .107          .204
  No reward               .049         -.092
No information
  Reward                 -.597a        -.042
  No reward                             .058

Information
  Shock                   .179          .655
  Reward                 -.197         -.120
No information
  Shock                  -0.27a         .779
  Reward                                .281

a With no information, trials are identical until the outcome
occurs.
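As a rough check on why the five correlations with information search reported above fall short of significance, the sketch below converts each reported r to a t statistic with n - 2 degrees of freedom. It assumes n = 24 (the number of subjects mentioned for the shock-no-shock condition) and a two-tailed .05 criterion; the function name and the choice of test are illustrative assumptions, not the authors' stated procedure.

```python
import math

# Correlations between percentage of search and the GSR indices, as reported in the text.
REPORTED_R = [0.14, 0.19, 0.20, 0.25, -0.22]

N_SUBJECTS = 24          # assumed: the 24 subjects of the shock-no-shock condition
T_CRIT_05_DF22 = 2.074   # two-tailed .05 critical t for 22 df (standard table value)


def r_to_t(r: float, n: int) -> float:
    """Convert a Pearson r to a t statistic with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


t_values = [r_to_t(r, N_SUBJECTS) for r in REPORTED_R]

for r, t in zip(REPORTED_R, t_values):
    verdict = "significant" if abs(t) > T_CRIT_05_DF22 else "not significant"
    print(f"r = {r:+.2f}  t(22) = {t:+.2f}  {verdict}")
```

Under these assumptions every |t| stays well below the critical value, which agrees with the statement that none of the correlations reached significance.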


Sequential Analysis 


Preparatory response or secondary rein-
forcement theories would generally predict
changes in search across trials within experi-
mental conditions; that is, as the subject
learns to make appropriate preparatory re-
sponses, or as cues acquire secondary rein-
forcing value, subjects should show an in-
creasing preference for information. On the
other hand, theories emphasizing the role of
uncertainty would predict rather stable pref-
erences for information across trials. To ex-
amine the extent of changes as a function of
prior experience several sequential analyses
were done.

Duncan (1955) range tests on the average 
search per trial within experimental condi- 
tions showed no consistent trends across 
trials. Similarly, Duncan range tests on mean
latencies per trial showed no consistent trends
across trials. 

These results indicate that trials per se 
had little or no effect on the frequency of 
search or the latency of the choice response. 
It is also possible, however, to examine the 
effects of more immediate and more specific 
past experience on the subject's search be- 
havior. To do this, trials were divided into 
four categories according to the subject's 
choice for information or no information and 
the specific outcome that occurred. 

Table 2 presents the percentage of informa- 
tion responses following each of these four 
trial types. Individual comparisons (Dixon & 
Massey, 1957) (p < .05) showed that within 
the shock-no-shock condition, the proportion
of search responses was significantly higher
following "no-information, no-shock" trials
and "no-information, shock" trials than fol-
lowing "information, shock" trials. Search
following "information, no-shock" trials, how-
ever, was not significantly different from
search on the other three trial types. Within
the reward-no-reward condition, the propor-
tion of search responses following "no-
information, no-reward" trials was signifi-
cantly greater than that following the other
three types of trials, but there were no sig-
nificant differences between the proportion of
search responses following different trial types
within the shock-reward condition. It should
be noted, however, that the mean difference
in search between "no-information, reward"
and "information, shock" trials reached the
p < .06 level.


TABLE 2

PERCENTAGE OF INFORMATION SEARCH RESPONSES
FOLLOWING FOUR TYPES OF TRIAL

                 Type of previous trial

      Information               No information

  Shock      No shock        Shock      No shock
  58.8a      62.6ab          69.2b      76.0b

  Reward     No reward       Reward     No reward
  46.8       59.4            53.8       53.5

  Shock      Reward          Shock      Reward
  56.6       65.0            62.5       67.2

Note. Mean differences were tested using p < .05. Within
experimental conditions, cells with the same subscript are not
different.


TABLE 3

MEAN LATENCY IN SECONDS PER SUBJECT
FOLLOWING FOUR TYPES OF TRIAL

                 Type of previous trial

      Information               No information

Shock-no-shock condition
  Shock      No shock        Shock
  1.90       1.80            1.87

Reward-no-reward condition
  Reward     No reward       Reward
  1.89       1.74            1.83

Reward-shock condition
  Shock      Reward          Shock
  1.85       1.81            1.65

Note. Within a condition, cells with the same subscript are
not significantly different.


An identical analysis on mean latency fol-
lowing the four trial types (Table 3) showed
that within the shock-no-shock condition,
response latencies (for information and no-
information responses combined) were sig-
nificantly longer following "no-information,
no-shock" trials than latencies on the remain-
ing three trial types. Within the reward-no-
reward condition, "no-information, no-reward"
trials were again followed by the longest
latencies although the mean difference be-
tween latency following these and "informa-
tion, reward" trials failed to reach statistical
significance. Within the reward-shock condi-
tion, the longest latencies followed "informa-
tion, reward" and "information, shock" trials,
although the difference in latency following
these and "no-information, no-reward" trials
was not significant.


Discussion 


The results of the present study indicate
that humans do prefer information about an
uncertain outcome even when unable to
overtly modify the outcome. The results are
consistent with those obtained by Prokasy
(1956) and Lockard (1963) using rats in
an appetitive and aversive situation, respec-
tively, and with Pervin's (1963) paired com-
parison preference data on humans. Unfor-
tunately, the results contribute less than
desired to a clarification of the basis for the
observed information preferences.

One finding does suggest that the subject
can modify his reaction to an outcome when
he is informed of its imminent occurrence.
In the shock-reward condition, when informa-
tion was chosen, the GSR reactions to re-
ward and shock were significantly different,
whereas without information they were not;
with information, both the reactions to shock
and reward were slightly less intense than
with no information. The attenuation of the
subject's GSR reaction suggests that the role
of information may be in the elimination of
"surprise" at the occurrence of an outcome
(Pervin, 1964). Since this result was peculiar
to the one experimental condition, a prepara-
tory response explanation is only weakly
supported.

Secondary reinforcement explanations
(Wyckoff, 1959) immediately encounter dif-
ficulty when applied to the aversive situa-
tion since it is difficult to assume that organ-
isms will show a preference for a cue associ-
ated with a negative outcome. The present
GSR results indicate that the cue indicating
shock did take on negative properties which,
according to secondary reinforcement notions,
would be expected to produce avoidance of
the cue.

In general, the findings preclude acceptance
of hypotheses which attribute search to the
value of information in changing outcome
properties. Even if human subjects enter the
situation with well-developed search habits,
some change over trials within outcome con-
ditions would be expected if the ability to
modify outcomes was operative. Search habits
might be expected to be negatively rein-
forced if they were not instrumental, be posi-
tively reinforced if they were instrumental,
increase or decrease in strength as cues ac-
quire positive or negative secondary rein-
forcement value, or increase in strength as the
subject learns what preparatory responses are
appropriate. The sequential analyses do not
provide support for any of these expectations.

Several findings do favor an uncertainty 
interpretation, however: 
1. Large and consistent individual differ-
ences in search are suggestive of similar indi-
vidual differences in need for certainty (Brim 
& Hoff, 1957) and preferences for informa- 
tion input (Glanzer, 1958). 

2. The stability of search across trials and 
outcome conditions is consistent with proper- 
ties of uncertainty within and across condi- 
tions (uncertainty is constant at one bit). 
Furthermore, the level of search under this 
level of uncertainty is about that expected 
from previous work (Driscoll & Lanzetta, 
1965). 

3. The responsiveness of search to im-
mediately previous experience suggests the
operation of a variable such as uncertainty.
Search is highest following "empty" trials;
that is, those trials on which information and
outcomes are both absent. The necessity of
tolerating uncertainty over a longer time
period during this type of trial results in
greater information search on the following
trial. This interpretation is supported by the
absence of significant trial-type contingencies
in the two-outcome, shock-reward, condition.
An outcome always occurs in this situa-
tion; there is never a prolonged interval of
uncertainty.
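The statement in point 2 that uncertainty is constant at one bit can be illustrated with Shannon's entropy formula. The sketch below assumes two outcomes per condition, each occurring on half the trials (an assumption implied by the one-bit figure rather than spelled out here); the 80/20 schedule is purely hypothetical and is included only for contrast.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete outcome distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Assumed: two outcomes per condition, each occurring on half the trials.
equiprobable = entropy_bits([0.5, 0.5])

# Hypothetical unequal schedule, for comparison only.
skewed = entropy_bits([0.8, 0.2])

print(f"50/50 outcomes: {equiprobable:.3f} bits")
print(f"80/20 outcomes: {skewed:.3f} bits")
```

Because the entropy depends only on the outcome probabilities and not on whether the outcomes are shocks or rewards, it is the same in all three outcome conditions under this assumption.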

The hypothesis that uncertainty motivates
information search directly, however, is in-
consistent with the latency results. If in-
creased periods of uncertainty increased
motivation to search, choice latencies should
be short after a long period of uncertainty.
Latencies following "empty" trials, however,
were long even though such trials increased
the probability of a subsequent search re-
sponse. This is a perplexing finding since it
is usually found that as the probability of a
response increases its latency decreases. Per-
haps conflict notions are most appropriate
(Berlyne, 1962). If it is assumed that search
results from uncertainty-produced conflict,
choice latency might be expected to be long
as a consequence of the conflict. The support
for uncertainty and conflict notions is, at this
point, admittedly a posteriori, and direct test-
ing of the implications of the theory will be
required. A study varying one factor which
would be expected to influence search, the
uncertainty of outcomes (relative frequency),
is now nearing completion.

In summary, human subjects appear to
prefer information about outcomes even when
the information has no apparent instrumental
value. This preference is stable over variations
in outcome conditions, indicating that the
nature of the uncertain outcome has little
effect on the preference. Some data suggest
that the subject, when he receives informa-
tion, is able to modify his affective reaction
to an outcome, but the effect is neither pro-
nounced nor pervasive. Preparatory response
explanations of information preferences are,
thus, only weakly supported by the data.
Furthermore, the conditioned GSR reaction
to information cues in the aversive situation
would not support a secondary reinforcement
explanation of information preferences. The
most tenable working hypothesis, considering
the relationship between search and latency
data, is that uncertainty-induced conflict
motivates information acquisition even when
the information has no apparent instrumental
value.


REFERENCES 


Berlyne, D. E. Conflict, arousal, and curiosity. New
York: McGraw-Hill, 1960.

Berlyne, D. E. Note on food deprivation and ex-
trinsic exploratory responses. Psychological Re-
ports, 1962, 11, 162.

Brim, O. G., & Hoff, D. B. Individual and situa-
tional differences in desire for certainty. Journal of
Abnormal and Social Psychology, 1957, 54, 225-
229.

Dixon, W. J., & Massey, F. J., Jr. Introduction to
statistical analysis. New York: McGraw-Hill,
1957.


JOHN T. LANZETTA AND JAMES M. DRISCOLL


Driscoll, J. M., & Lanzetta, J. T. Subjective un-
certainty and predecisional information search and
processing. Psychological Reports, 1965, 14, 975- 
988. 

Duncan, D. B. Multiple range and multiple F tests. 
Biometrics, 1955, 11, 1-42. 

Glanzer, M. Curiosity, exploratory drive, and stimu-
lus satiation. Psychological Bulletin, 1958, 55, 302-
315.

Holmquest, D., & Edelberg, R. Problems in the
analysis of the endosomatic galvanic skin response.
Psychophysiology, 1964, 1, 48-54.

Jones, A., Wilkinson, H. J., & Braden, J. Informa-
tion deprivation as a motivational variable. Jour-
nal of Experimental Psychology, 1961, 62, 126-131.

Lockard, Joan S. Choice of warning signal or no
warning signal in an unavoidable shock situation. 
Journal of Comparative and Physiological Psy- 
chology, 1963, 56, 526-530. 

Perkins, C. C., Levis, D. J., & Seymann, R. Pref-
erence for signal-shock vs. shock-signal. Psycho-
logical Reports, 1963, 13, 735-738. 

Pervin, L. A. The need to predict and control under
conditions of threat. Journal of Personality, 1963, 
31, 570-587. 

Pervin, L. A. Predictive strategies and the need to
confirm them: Some notes on pathological types
of decisions. Psychological Reports, 1964, 15, 99-
105.

Prokasy, W. F. The acquisition of observing re-
sponses in the absence of differential external re-
inforcement. Journal of Comparative and Physio-
logical Psychology, 1956, 49, 131-134. 

Shaver, B. A., Brusilow, S. W., & Cooke, R. E.
Origin of the galvanic skin response. Proceedings 
of Experimental Biology and Medicine, 1962, 110, 
559. 

Wyckoff, L. B., Jr. The role of observing responses
in discrimination learning: Part I. Psychological 
Review, 1952, 59, 431-442. 

Wyckoff, L. B., Jr. Toward a quantitative theory
of secondary reinforcement. Psychological Review, 
1959, 66, 68-77. 


(Early publication received July 26, 1965)


Journal of Personality and Social Psychology 
1966, Vol. 3, No. 1, 103-105 


BRIEF ARTICLES 


PALMAR SWEATING AS A FUNCTION OF INDIVIDUAL
DIFFERENCES IN MANIFEST ANXIETY 1


H. CARL HAYWOOD 
George Peabody College


AND 


CHARLES D. SPIELBERGER 


Vanderbilt University 


The relationship between individual differences in anxiety as measured by
scores on the Taylor Manifest Anxiety (MA) scale and physiological arousal
as reflected in the Palmar-Sweat Index (PSI) was investigated. PSI measures
were taken after a period of adaptation at the beginning of a verbal con-
ditioning experiment and later during the same experiment. The PSI for high-
anxiety Ss was significantly higher on both occasions than was that of low-
anxiety Ss, and the PSI measures of both groups declined during the experi-
ment. These findings suggest that the PSI is a sensitive measure of individual
differences in anxiety, and that caution should be exercised in assumptions
regarding the extent to which experimental situations induce arousal.


Although relationships between self-report
measures of anxiety and physiological indices of
emotionality or arousal have been explored in a
number of recent investigations, there has been
little consistency in the findings of such studies.
In most investigations no relationship has been
found between anxiety inventories and individual
physiological measures of arousal (e.g., Beam,
1955; Katkin, 1965; McGuigan, Calvin, & Rich-
ardson, 1959; Silverman, 1957). Positive rela-
tionships between scores on anxiety inventories
and indices of arousal were obtained, however,
in two studies. In these the physiological measure
was change in arousal from a preexperimental
adaptation period to a period during the experi-
ment in which the subjects were presumably
performing under greater stress (Haywood, 1961;
Runquist & Ross, 1959).

The purpose of this study was to investigate 
the relationship between individual differences in
anxiety and physiological arousal in a psycho-
logical experiment in which no specific experi-
mental operations were employed to induce
arousal. More specifically, the question explored 
in the present study was whether or not subjects 
with extreme scores on the Taylor (1953) Mani- 
fest Anxiety (MA) scale can be distinguished on 


1 We wish to express appreciation to Richard
Gorsuch, Joe McCarthy, and Richard Ratliff for 
their assistance in the collection and analysis of the 
data obtained in this study. During the period of 
this research the first author was supported by Grant 
MH-6159 from the National Institute of Mental 
Health. Further support was derived from grants to 
the second author from the National Institute of
Mental Health (Grant MH-7446) and Child Health 
and Human Development (Grant HD-947), United 
States Public Health Service. 
the basis of their Palmar-Sweat Index (PSI) at
the beginning of a verbal-conditioning experi- 
ment and later during the same experiment. On 
the basis of previous findings, it was expected, 
after a short adaptation period in which there 
was no stimulation deliberately introduced to 
arouse the subjects, that high-anxiety (HA) and 
low-anxiety (LA) subjects would not differ in 
PSI. During the experiment, however, it was 
expected that the PSI scores for all subjects 
would increase since, as Taylor (1956) has 
pointed out, most college students perceive 
psychological experiments as somewhat threaten- 
ing. Furthermore, it was expected that the PSI 
scores of HA subjects would increase more than 
those of LA subjects because of the presumed 
greater emotional responsiveness of HA subjects
to mildly threatening situations (Spence, 1958). 


METHOD 
Subjects 


The subjects were 61 male undergraduates en- 
rolled in the introductory psychology course at 
Vanderbilt University who were selected from a 
larger pool on the basis of their scores on the MA 
scale. HA subjects had MA scale scores of 21 and 
above, while LA subjects had MA scale scores of 8 
and below. These scores represented approximately 
the upper and lower quartiles of the total distribu- 
tion of MA scale scores. 


Apparatus and Procedure


Arousal was measured by the PSI, using a photo- 
metric technique, described previously by Haywood 
(1963). The single modification in the PSI tech- 
nique was the use of an automatic fingerprinter. 
The PSI prints were made on transparent Kodapak 
K-509 film impregnated with tannic acid. The auto-
matic fingerprinter applied the film against the
palmar surface of the finger with a constant pressure
of 4 pound for a period of 30 seconds.


TABLE 1

MEAN PSI OF HIGH-ANXIETY AND LOW-ANXIETY SUB-
JECTS AFTER A PERIOD OF ADAPTATION AT THE
BEGINNING OF A VERBAL CONDITIONING
EXPERIMENT AND LATER DURING THE
SAME EXPERIMENT

                  Occasions of measurement

                       I            II
High anxiety         16.00        13.75
                     (8.77)a      (7.52)
Low anxiety          12.38         9.69
                     (5.81)       (5.25)
t                     1.92*        2.

a Parenthetical entries are standard deviations.
* p < .10.
** p < .02.

As each subject entered the experimental room, 
the PSI procedure was explained to him and a dem- 
onstration PSI print was made. After an adaptation 
period of approximately 5 minutes, during which 
the subject rested and was given instructions for a 
sentence-construction verbal conditioning task 
(Spielberger, 1962), another PSI print (Print I) was 
made to assess each subject’s basal (adapted) level 
of arousal. The subjects were then required to con- 
struct 20 sentences in the operant (nonreinforced) 
period of the verbal conditioning task.2 A final PSI 
print (Print II) was then made. For all subjects the 
demonstration print was taken from the index finger 
of the nondominant hand, and Prints I and II were 
taken from the index and ring fingers of the domi- 
nant hand. The experimenter had no knowledge of 
the MA scale scores of individual subjects. 

The measures of arousal were the density of the 
PSI prints on the two occasions of measurement 
following the demonstration print. This was deter- 
mined with a Lab-line densitometer which yielded 
readings on a density scale of 0-100. The 0 point in-
dicated absolute transparency; the 100 point indi-
cated absolute opacity.


RESULTS 


The mean density scores for PSI Prints I and 
II of the HA and LA subjects are given in Table 
1, in which it may be noted that the PSI for 
HA subjects was higher on both occasions than 
was that of LA subjects. It may also be noted 
that, contrary to expectation, the PSI declined 
for both groups from Print I (the adaptation 
print) to Print II (the print taken during the 


2 The verbal conditioning findings will not be
reported here. In general, these were consistent
with results previously reported (Spielberger, 1962; 
Spielberger, Berger, & Howard, 1963; Spielberger, 
DeNike, & Stein, 1965). 
experiment). Analysis of variance of these data
yielded an F ratio for anxiety levels of 6.12 (df
= 1/59, p < .025), and an F ratio for trials of
7.12 (df = 1/59, p < .01). The interaction F
ratio was not significant (F = .06, df = 1/59).
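The group comparisons summarized by the t row of Table 1 can be approximated from the reported means and standard deviations alone. The sketch below uses a pooled-variance two-sample t; the split of the 61 subjects into 30 HA and 31 LA is a hypothetical assumption (the group sizes are not given here), and the function name is illustrative, so the values it prints are approximations rather than the published statistics.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic using a pooled variance estimate."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_error


# Hypothetical split of the 61 subjects; the actual group sizes are not reported.
N_HA, N_LA = 30, 31

# Means and standard deviations taken from Table 1.
t_print1 = pooled_t(16.00, 8.77, N_HA, 12.38, 5.81, N_LA)
t_print2 = pooled_t(13.75, 7.52, N_HA, 9.69, 5.25, N_LA)

print(f"Print I:  t({N_HA + N_LA - 2}) = {t_print1:.2f}")
print(f"Print II: t({N_HA + N_LA - 2}) = {t_print2:.2f}")
```

With a different split of subjects between groups the values shift somewhat, which is why the output should be read only as a plausibility check against the table.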


Discussion 


There were two surprising results in the pres-
ent study. First, although it was anticipated that
the HA and LA groups would not differ in level
of palmar sweating after an adaptation period, it
was found that the HA subjects were signifi-
cantly higher than were the LA subjects in this
measure of arousal. Second, both groups de-
creased in palmar sweating during the experi-
ment; this decline was not different for the two
anxiety groups.

A reexamination of the procedure leads to the
tentative conclusion that the PSI measure taken
after a period of adaptation at the beginning of
the experiment did not, in fact, represent a
basal, or adapted, level of arousal. It seems
quite probable that the ambiguous instructions
for the verbal-conditioning task preceding the
"basal" PSI induced arousal in both HA and LA
subjects, and that adaptation occurred following this
period. The relative simplicity of the verbal-
conditioning task may well have been less arous-
ing than was the anticipation of it. Assuming
the validity of this interpretation, the results are
generally consistent with previous findings of
higher arousal in HA subjects than in LA sub-
jects when stressful stimulation is imposed on
both groups. It is widely accepted that individual
differences in anxiety, as measured by self-report
inventories, are related to physiological indices
of autonomic nervous system activity (Cattell &
Scheier, 1961). However, except when arousal
indices are based on the difference between meas-
ures of arousal taken during "neutral" and
"stress" conditions, the predicted systematic re-
lationships between scores on anxiety inventories
and indices of physiological arousal are rarely
found (e.g., Beam, 1955; Katkin, 1965). The
relationship between anxiety and the PSI ob-
tained in the present study may reflect improve-
ments in the technology of the PSI, and would
seem to indicate that the PSI is a sensitive meas-
ure of autonomic arousal. Even the mildly stimu-
lating situation employed in this experiment
(presumably anticipation of a simple verbal-
conditioning task) produced measurable differ-
ences in palmar sweating which, moreover, were
significantly related to anxiety.
The findings in this study also suggest caution
regarding assumptions of the extent to which
experimental situations will induce arousal. What
was believed to be an affectively neutral period
at the beginning of the experiment turned out to
induce more arousal in both HA and LA sub-
jects than did the experimental task which, on a
priori grounds, was believed to be more threat-
ening.


ACTIVATION, CONTROL, AND THE SPIRAL AFTERMOVEMENT 1


PAUL LEVY AND PETER J. LANG 2
University of Pittsburgh


60 college students were selected on the basis of anxiety and impulsivity
questionnaires. They were then tested for duration of the spiral aftermove-
ment and resting cardiac rate. Longer aftermovements were found for high-
anxious Ss than for low-anxious Ss; impulsive students showed shorter after-
movements than nonimpulsives. Aftermovement duration was significantly
related to an interaction between heart rate and heart-rate variability. It was
concluded that individual differences in the aftermovement are explained in
part by different levels of activation and control, as these constructs are
assessed through personality questionnaires and by cardiovascular activity.


1 These data were collected by the first author in
preparing a master's thesis at the University of
Pittsburgh, 1964. The research was directed by the
second author, and supported by Grant M-3880,
from the National Institute of Mental Health,
United States Public Health Service.

2 Now at the University of Wisconsin.


Individual differences in the duration of the
spiral aftermovement have been observed by
many investigators, and both personality con-
structs and hypothesized physiological states have
served in theoretical explanations. Eysenck, Hol-
land, and Trouton (1957) observed a decrease
in aftermovement following the administration of
sodium amytal. Costello (1960) found that
meprobamate, another central nervous system
depressant, yielded a similar effect. Eysenck
(1957) attributed these results, and the reduced
aftermovement sometimes observed in brain-
damaged subjects, to a hypothetical "cortical
inhibition." Furthermore, this inhibitory state
was held to be high in subjects diagnosed hys- 
teric or psychopathic, and in subjects with high 
scores on the Maudsley Extroversion scale. 
Claridge (1960) reported data consistent with 
the last hypothesis. However, Holland and Ey- 
senck (1960) observed no relationship between 
the aftermovement and extraversion in a study of 
industrial apprentices. Furthermore, Mayer and 
Coons (1960) have presented findings which 
question the hypothesis of short aftermovement 
in brain-damaged subjects. They found no after- 
movement differences between functional and 
organic patients when the instructions were re- 
assuring. 

Claridge (1960) failed to find that hysterics re- 
port significantly shorter aftermovement than 
normals. However, dysthymics and schizophrenics 
did have longer than normal aftermovements. 
Anxiety characterizes the former diagnosis and 
Lang and Buss (1965) present strong evidence 
that schizophrenics are chronically hyperaroused. 
These results, considered together with the studies 
of depressant drugs, suggest an alternative hy- 
pothesis: long spiral aftermovements are related 
to high levels of arousal or activation. Eysenck 
and Holland (1960) reported negatively on this 
proposition in a study of drive and the Archi- 
medes spiral. However, their experimental manip- 
ulation had questionable validity: low-drive 
subjects were factory workers and high-drive 
subjects were simply applicants being evaluated 
for the same job. 

The relationship between the aftermovement 
and activation deserves further exploration. The 
Taylor (1953) Manifest Anxiety (MA) scale is 
a useful questionnaire measure of the latter con-
struct. While it is by no means factorially pure,
there is evidence that this scale correlates both 
with physiological indices of arousal (Barratt, 
1962) and the clinical analogue of arousal, anxi- 
ety (Buss, Wiener, Durkee, & Baer, 1955; Sieg- 
man, 1956). 

Pavlovian cortical inhibition seems to have 
little value as an explanation of aftermovement 
findings. However, length of aftermovement may 
well be related to behavioral inhibition or im- 
pulsivity. Although the subject is instructed to 
report on the presence or absence of a subjective 
phenomenon, it is the latency of the verbal report 
that is actually measured. The subject's basic 
capacity to delay responding could contribute 
importantly to this measure. Eysenck and Ey- 
senck (1963) report that their extraversion scale 
includes subfactors of sociability and impulsivity, 
but the latter trait has not been separately as- 
sessed in previous studies of the aftermovement. 
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Despite the organismic tack taken by inter- 
preters of the aftermovement phenomenon, little 
effort has been made to directly measure relevant 
somatic activity. However, activation has fre- 
quently been identified with high activity levels 
in the autonomic nervous system (Freeman, 
1948; Malmo, 1958). If long aftermovement is 
related to a high level of arousal, a positive 
correlation with heart rate would be anticipated. 
Furthermore, Lacey and Lacey (1958) suggest
that variability in autonomically mediated re-
sponses is characteristic of impulsive individuals.
They suggest that feedback via blood-pressure-
sensitive receptors in the carotid artery causes
momentary changes in cortical activity and as-
sociated motor behavior. These authors specu-
late that perceptual-motor processes may be
greatly influenced by this phenomenon, and that
these effects are most readily seen in subjects
with highly variable cardiac rate. In support of
this position, they have demonstrated a negative
relationship between the ability to inhibit er-
roneous responses in a reaction-time situation
and cardiac rate lability (Lacey, 1959; Lacey &
Lacey, 1958).

Cardiac rate variability and average cardiac
rate are relatively uncorrelated (Lacey & Lacey,
1958). In this they parallel the questionnaire-
derived traits of impulsivity and anxiety (Bar-
ratt 3). These pairs of dimensions may be
brought together under the broader constructs of
activation and control. The former refers in a
general way to the amplitude or intensity of be-
havior; the latter describes the efficiency with
which behavior is focused or directed. It is sug-
gested that these two factors are important con-
tributors to individual differences in the dura-
tion of the spiral aftermovement.


The Problem 


1. The proposed experiment was designed to 
evaluate the relationship of the aftermovement 
phenomenon to the concepts of activation and 
control. Both psychophysiological and question-
naire measures of these constructs were em-
ployed in the experiment. Initially, the subjects
were divided into four questionnaire subgroups 
according to their anxiety and impulsivity scale 
scores. Subgroup aftermovement durations were 
then compared. Next, the subjects were reas- 
signed to different subgroups, according to their 
average heart rate and heart-rate variability 
measures. Again the subgroup aftermovement 


3 "ANS Activity Related to Intra-individual Varia-
bility,” final report to the National Institute of 
Mental Health, United States Public Health Service, 
Project M-4534, 1961. 
durations were analyzed. It was held that indi-
vidual differences in aftermovement duration are
attributable to both the questionnaire and psy-
chophysiological indices of the activation and
control constructs. Relationships between the
questionnaire and physiological measures were
also investigated.

2. Spigel (1961) studied the effects on the 
aftermovement of an interval of darkness im- 
mediately following the rotation of the spiral 
and preceding the presentation of the stationary 
image. He found that the aftermovement be- 
came shorter with longer intervals. He also 
showed that when a nonhomogeneous perceptual 
field was presented during the interpolated pe- 
riod (rather than darkness) the mean aftermove- 
ment was further reduced. In the present experi- 
ment both the nondelayed aftermovement and 
darkness interval aftermovement were evaluated. 
It was suggested that afferent feedback from the 
cardiovascular system may have effects similar 
to external stimulation. Thus, constant high or 
low heart rate might not greatly affect the 
length of the aftermovement. However, high 
cardiovascular variability during the darkness 
interval would act in the same way as a variable 
external field, and induce a shorter aftermove- 
ment. Paralleling this hypothesis, impulsive sub- 
jects (who are held to have high resting cardiac 
variability) would be expected to show greater 
aftermovement reduction associated with the 
darkness interval than nonimpulsives. 


METHOD 
Subjects 


Personality measures. Sixty male undergraduates, 
chosen over a period of three semesters from the 
introductory psychology course at the University of 
Pittsburgh, participated in this experiment. They 
were selected on the basis of scores obtained on the 
Taylor (1953) MA scale and a 24-item impulsivity 
scale developed by Wilkinson (1962). The latter 
scale defines impulsivity as the inability to tolerate 
delay. The following are sample items: “I find my 
snap judgments tend to be good ones”; “I am 
rather cautious in my reactions to unexpected situ- 
ations”; “I tend to respond quickly to most situa- 
tions.” The Wilkinson scale has a high degree of 
internal consistency: the biserial correlations range 
from .45 to .75 with a median coefficient of .57. 
Furthermore, little relationship exists between the 
Wilkinson and Taylor scales. A nonsignificant corre- 
lation of .14 was obtained from a random selection 
of introductory psychology students (N = 60), not 
employed in this experiment. 

Subjects who had received scores falling within the 
extreme quadrants of both scales were assigned to 
four experimental groups (subgroup n = 15): high 
anxious-high impulsive, high anxious-low impulsive, 
low anxious-high impulsive, low anxious-low im- 

TABLE 1 

ANXIETY AND IMPULSIVITY MEAN SCORES AND RANGES 
FOR THE PERSONALITY INVENTORY SUBGROUPS 

                         Group 
Measure          1        2        3        4 

Anxiety 
  M            24.53    23.80     8.60     8.40 
  Range        22-28    21-28     4-11     3-11 
Impulsivity 
  M            18.27     6.07    17.53     5.20 
  Range        16-21      1-8    16-22      2-8 


pulsive. Table 1 gives the ranges and mean scores of 
the four subgroups on the anxiety and impulsivity 
scales. An analysis of variance did not reveal signifi- 
cant differences between highs or between lows in 
different subgroups. 

Cardiac activity scores. Subjects were assigned to 
new groups on the basis of their average heart rate 
and heart-rate variability. Average rate was deter- 
mined in the following manner: each subject’s cardiac 
activity was recorded for 15 minutes and a count was 
made of the total number of beats occurring during 
this period. From this data, each subject’s mean 
number of beats per minute was determined. The 
measure of heart variability employed was similar 
to that employed by Lacey.⁴ For the last 5 minutes 
of the 15-minute rest interval, peak to trough heart 
rate differences were taken from the cardiotachometer 
record, and a frequency distribution of these dif- 
ferences charted for each subject. The median of 
the distribution was the subject’s variability score. 
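
The two cardiac scores described above can be sketched in a few lines. This is only an illustration of the stated scoring rules, not the authors' procedure as run: the function names and the sample readings are hypothetical, and the pairing of each peak with the trough that follows it is an assumption about how the cardiotachometer record was read.

```python
from statistics import median


def mean_heart_rate(total_beats: int, minutes: float = 15.0) -> float:
    """Mean beats per minute from a beat count over the rest interval."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total_beats / minutes


def variability_score(peaks, troughs):
    """Median of the peak-to-trough heart-rate differences.

    `peaks` and `troughs` are paired readings (beats per minute); the
    pairing rule is an assumption made for this sketch.
    """
    if len(peaks) != len(troughs) or not peaks:
        raise ValueError("need equal-length, non-empty peak and trough lists")
    differences = [p - t for p, t in zip(peaks, troughs)]
    return median(differences)


# Hypothetical subject: 930 beats in 15 minutes, three peak-trough pairs.
rate = mean_heart_rate(930)
score = variability_score([70, 72, 68], [64, 65, 63])
```
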

Mean heart rate and median heart variability 
scores for all subjects were cast into two separate 
distributions. Subjects were then assigned to one of 
four groups, depending on whether they fell above 
or below the median on each of the two distribu- 
tions. Table 2 indicates the range, mean heart rate, 
and heart-rate variability of the four cardiac sub- 
groups. As before, analysis of variance revealed no 
significant differences between either highs or lows 
in different groups. 
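
The double median split can be illustrated as follows. This is a minimal sketch under stated assumptions: the group letters follow the labeling of Table 5 (A = low rate and low variability, B = low rate and high variability, C = high rate and low variability, D = high on both), the function name is hypothetical, and placing a score that exactly equals the median in the "below" group is a choice made here, since the text does not say how ties were handled.

```python
from statistics import median


def assign_cardiac_groups(rates, variabilities):
    """Assign each subject to group A, B, C, or D by two median splits."""
    if len(rates) != len(variabilities) or not rates:
        raise ValueError("need equal-length, non-empty score lists")
    rate_cut = median(rates)
    var_cut = median(variabilities)
    # Keys are (above median rate, above median variability).
    labels = {
        (False, False): "A",
        (False, True): "B",
        (True, False): "C",
        (True, True): "D",
    }
    return [labels[(r > rate_cut, v > var_cut)]
            for r, v in zip(rates, variabilities)]


# Hypothetical scores for four subjects.
groups = assign_cardiac_groups([60, 62, 78, 80], [4, 9, 5, 10])
```
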


Apparatus 


Heart rate was recorded by a Fels cardiotach- 
ometer with write-out on a Grass polygraph. The 
standard EKG lead II was employed, with electrode 
placements on the right wrist and left calf. Raw 
EKG, muscle potentials from the right forearm, 
GSR (Fels dermohmmeter), and respiration data 
were also written out on the Grass instrument. 

The stimulus employed in obtaining the after- 


⁴ John I. Lacey described a similar method based 
on a 15-minute sample in a personal communica- 
tion, 1964. However, analysis of 8 records randomly 
selected from the 60 subjects used here, indicated 
that the 5-minute distributions were essentially the 
same as those obtained for the entire 15-minute rest 
interval. 

TABLE 2 

HEART RATE AND VARIABILITY MEAN SCORES AND RANGES FOR THE CARDIAC SUBGROUPS 

               Group A (n = 16)   Group B (n = 14)   Group C (n = 14)   Group D (n = 16) 
               Below median HR,   Below median HR,   Above median HR,   Above median HR, 
               below median HV    above median HV    below median HV    above median HV 

Rate 
  M                 61.8               64.4               77.4               76.7 
  Range          54.4-66.8          55.0-69.0          69.8-91.0          69.0-90.6 
Variability 
  M                 4.83               8.83               5.54               9.64 
  Range           3.0-7.2            7.4-12.0           4.0-7.3             7.57.0 


movement was a small circular disc 6 inches in 
diameter with a radial, spiral pattern printed on it. 
From a darkened room, the subject observed a 14 
inch, reverse projected image of the disc. In an 
adjacent apparatus cubicle, a Hurst 12 rpm syn- 
chronous motor with instantaneous stop rotated the 
pattern at a constant speed. The cubicle also con- 
tained the polygraph, a timing circuit with several 
switches which control the polygraph marking pen, 
the projector, rotation of the disc, and a Standard 
Electric timer. A subject switch was wired to the 
marking pen and also stopped the timer. 


Procedure 


The experiment consisted of three parts: (a) Five 
30-second presentations of the rotating spiral. At the 
end of each presentation the projector was turned 
off for .10 second. When the image was reprojected, 
it was physically stationary. Coincident with the 
reillumination of the disc, the timer was activated 
and the polygraph paper marked. When the subject 
no longer saw the aftermovement he pressed a button 
under his right hand, again marking the polygraph 
paper and also terminating the timer. For all of these 
presentations the mean lengths of the aftermove- 
ments (in 1/100s of a second) were calculated. (b) 
There was a 15-minute rest interval, during which 
time somatic activity was continuously recorded. 
(c) Five more trials were presented and the mean 
length of aftermovement was calculated. During these 
latter trials, rather than a momentary hiatus, a 
30-second darkness interval was interpolated between 
presentations of the rotating and stationary spiral. 

The procedure was the same for all subjects re- 
gardless of experimental group. The subject was 


TABLE 3 

MEAN INITIAL AND FINAL TRIAL AFTERMOVEMENT 
DURATIONS FOR EACH OF THE PERSONALITY 
INVENTORY SUBGROUPS 

                           Initial                 Final 
                       Low       High         Low       High 
                     anxious   anxious      anxious   anxious 

Low impulsivity       18.90     27.40        18.15     24.50 
High impulsivity      11.06     15.67         8.37     13.46 


seated in a comfortable lounge chair and the elec- 
trodes and respiration bellows were secured. The 
lights in the experimental room were turned off and 
the rotating spiral was projected onto the screen 
approximately 4 feet directly in front of the subject. 
Following a brief explanation, the subject was in- 
structed to fixate on the center of the disc, and 
administered one practice trial. The subject was then 
told: 


In a few moments, the procedure will be re- 
peated. You are to do exactly as you did before. 
Remember, press the button when you can no 
longer perceive any movement. It is very impor- 
tant that you do not move your eyes but con- 
tinue to fixate on the center metallic part of the 
disc. Except for the necessary movement of your 
right hand, you are to remain as motionless as 
possible so that an accurate biological record may 
be obtained. Everyone sees the movement for 
different lengths; try to do your best. 


After the five experimental trials, the screen was 
closed. The room remained dark. The subject was 
told to relax, and for the next 15 minutes somatic 
activity was continuously recorded. At the end of 
the 15-minute period, the screen was opened and 
the subject was told that the disc would again be 
presented, but between rotating and stationary 
images the projector light would be off for a longer 
duration than before. Five experimental trials were 
then presented. The intertrial interval was again 2 
minutes. Somatic activity was recorded continuously 
during the interpolated 30-second intervals of dark- 
ness. 


RESULTS 
Personality Inventory Subgroups 


1. Table 3 gives the mean aftermovement dura- 
tions for each of the personality inventory sub- 
groups. The patterns of means are similar in 
initial and final trials. Although the initial dura- 
tions seem to be larger than those of the 
corresponding final trials, a t test detected no 
significant difference. 

The effects of impulsivity and anxiety on per- 
ceived length of aftermovements were tested by 
an analysis of variance (2 × 2 factorial design). 
A summary of the results for the initial and final 
trial test periods is presented in Table 4. The 
differences in mean aftermovement duration were 
significant at the .01 level for the impulsivity 
variable in both the initial and final trials. Those 
subjects who, as a group, had low impulsivity 
scores perceived aftermovement for a longer 
period of time than the high impulsives. 

The differences in mean aftermovement dura- 
tion were significant at the .05 level for the 
anxiety variable in the initial trial period. A simi- 
lar trend (though not significant) was obtained 
for the final aftermovement trials. In neither 
analysis was a significant interaction obtained. 
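
Each F reported in Table 4 is the ratio of an effect mean square to the within-groups error mean square. The sketch below redoes that arithmetic for the initial trials using the mean squares printed in the table; the function name is hypothetical, and the check is offered only as an illustration of how the tabled values relate to one another.

```python
def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F statistic as effect mean square over error mean square."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


# Mean squares for the initial trials, as printed in Table 4.
MS_WITHIN_INITIAL = 131.50
f_anxiety = f_ratio(643.94, MS_WITHIN_INITIAL)
f_impulsivity = f_ratio(1437.46, MS_WITHIN_INITIAL)
```

Rounded to two decimals, these ratios agree with the F values of 4.90 and 10.93 given in the table for anxiety and impulsivity.
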

2. F tests of the mean absolute change and 
percentage change (Spigel, 1961) from initial to 
final trials yielded no significant differences 
among the four personality scale subgroups. 

TABLE 4 

ANALYSIS OF VARIANCE OF MEAN INITIAL AND FINAL 
AFTERMOVEMENTS FOR THE FOUR PERSONALITY 
INVENTORY SUBGROUPS 

Source                df        MS          F 

Initial 
  Anxiety (A)          1      643.94       4.90* 
  Impulsivity (B)      1     1437.46      10.93** 
  A × B                1       56.41      [illegible] 
  Within error        56      131.50 
Final 
  A                    1      490.50       3.00 
  B                    1     1625.00       9.94** 
  A × B                1        5.92        .36 
  Within error        56      163.33 

* p < .05. 
** p < .01. 


However, Spigel's data suggest that there is no 
substantial decay of the aftermovement unless 
the interpolated darkness interval is at least two 
times the subject's mean aftermovement. Since 
the interpolated darkness interval was held con- 
stant at 30 seconds, it is possible that only sub- 
jects having initial mean aftermovements of 15 
seconds or less would show substantial decay. 
They alone experienced a darkness interval long 
enough to cause a significant shortening of the 
aftermovement. Hence it was reasoned that if 
postulated differences between low and high 
impulsives’ percentage scores were to appear in 
the data, the differences would be most likely to 
occur for these particular subjects. In point of 
fact, the overall, average percentage of change 
for the above 15-seconds group was somewhat 
smaller (.01) than for the 15-seconds-or-less 
group (.08, N = 31). Furthermore, the mean 
percentage reduction for the high impulsives of 
the 15-seconds-or-less group was .21 and the 
mean percentage reduction for the low impul- 

TABLE 5 

MEAN INITIAL AND FINAL AFTERMOVEMENT DURA- 
TIONS FOR THE FOUR CARDIAC SUBGROUPS 

              Heart rate              Aftermovement 
Group    Variability   Average      Initial    Final 

B           High         Low          11.9       9.9 
D           High         High         23.3      20.3 
A           Low          Low          19.8      17.4 
C           Low          High         17.2      16.1 

Note. Subjects were assigned to groups on the basis of the 
cardiac rate and variability measures as obtained during the 
15-minute rest interval. 


sives was —.27 (initial score and absolute change 
yielded an r of .66). A test of this difference in 
percentage of reduction resulted in a t of 2.05, 
p < .09 (high- and low-anxious subjects were 
about equally represented in these two impulsiv- 
ity subgroups). 
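
The reasoning in this paragraph rests on two small computations: a proportional change from initial to final aftermovement, and the criterion that the 30-second darkness interval be at least twice a subject's initial mean aftermovement. The sketch below is a hedged illustration only. The exact percentage-change formula of Spigel (1961) is not reproduced in the text, so the definition used here (reduction expressed as a proportion of the initial value, positive when the aftermovement shortens) is an assumption, as are the function names.

```python
def percentage_reduction(initial: float, final: float) -> float:
    """Reduction from initial to final as a proportion of the initial value.

    Positive values mean the aftermovement became shorter; negative values
    mean it became longer. This definition is assumed for illustration.
    """
    if initial <= 0:
        raise ValueError("initial duration must be positive")
    return (initial - final) / initial


def darkness_long_enough(initial_mean: float,
                         darkness_seconds: float = 30.0,
                         factor: float = 2.0) -> bool:
    """True when the darkness interval is at least `factor` times the
    subject's initial mean aftermovement (15 seconds or less for 30 s)."""
    return darkness_seconds >= factor * initial_mean


# Hypothetical subjects with initial means of 12 and 19 seconds.
short_subject = darkness_long_enough(12.0)
long_subject = darkness_long_enough(19.0)
```
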

3. The analysis of variance (2 × 2 factorial 
design) of differences in the mean heart rate of 
the four questionnaire subgroups (as recorded 
both during the last 5 minutes of the rest interval 
and the average of the 30-second interpolated 
darkness intervals) did not yield significant dif- 
ferences either for the impulsivity or anxiety 
scales. No interaction effect was obtained. Simi- 
lar analyses of differences in the mean heart 
variability scores of the four subgroups did not 
yield significant main effects or interactions. 


Cardiac Rate Subgroups 


1. The four subgroups were classified on the 
basis of the 15-minute rest interval rate and 
variability measures. Table 5 shows their initial 
and final mean aftermovements. An analysis of vari- 
ance (2 × 2 factorial design, method of un- 
weighted means) of differences in the mean length 
of the initial aftermovement, yielded an inter- 
action effect between heart rate and heart varia- 
bility that was significant at the .05 level. Table 
6 presents a summary of this analysis. Although 


TABLE 6 

ANALYSIS OF VARIANCE OF MEAN INITIAL AFTER- 
MOVEMENTS FOR THE FOUR CARDIAC SUBGROUPS 

Source                   SS (cell means)   df       SS         MS        F 

Heart rate (A)               19.339         1      288.63     288.63    1.91 
Heart variability (B)          .807         1       12.04      12.04 
A × B                        49.808         1      743.38     743.38    4.92* 
Within error                               56     8461.51     151.10 

* p < .05. 


TABLE 7 

MEAN IMPULSIVITY AND ANXIETY TEST SCORES 
FOR THE FOUR CARDIAC SUBGROUPS 

              Heart rate 
Group    Variability   Average      Impulsivity   Anxiety 

B           High         Low           14.29       16.14 
D           High         High           9.38       18.00 
A           Low          Low           11.38       14.31 
C           Low          High          12.43       16.93 


no difference in mean length of initial after- 
movement was found between high and low 
heart-rate groups with low variability, a t test 
revealed that the difference in lengths of mean 
aftermovement for the two high variability 
groups apparent in Table 5 was significant at the 
.005 level (t = 2.86). A similar analysis of the 
final aftermovement data did not prove signifi- 
cant. 

Furthermore, regrouping the subjects on the 
basis of average cardiac rate and variability for 
the 30-second darkness intervals, yielded no sig- 
nificant results.⁵ F tests of the differences in mean 
percentage change among the four cardiac sub- 
groups, defined either by the last 5 minutes of 
rest or the 30-second interpolated darkness in- 
tervals, were not significant. 

2. Table 7 gives the mean anxiety and im- 
pulsivity test scores for each of the four cardiac 
subgroups. 

The analysis of variance (2 X 2 factorial de- 
sign, method of unweighted means) of differences 
in anxiety scores for the four subgroups did not 
yield any significant differences. However, in a 
similar F test of differences in impulsivity scores 
for the four subgroups, the probability of the 
obtained interaction (between cardiac rate and 
cardiac variability) was less than .08 (F = 3.23). 
No difference in impulsivity was found between 
high and low heart-rate groups with low varia- 
bility. However, a t test revealed that the differ- 
ence between high and low heart-rate groups 
with high variability, apparent in Table 7, was 
significant at the .01 level (t = 2.47). 


DISCUSSION 


These results indicate that high-impulsive sub- 
jects, as a group, report shorter aftermovements 
than do low impulsives. It was further found 
that high-anxious subjects report longer after- 


⁵ Rate and variability were calculated for the five 
30-second darkness intervals, during the final after- 
movement trials. Based on these scores, subjects 
were reassigned to cardiac subgroups in the same 
manner as in the main heart-rate analysis. 

movements than low-anxious subjects. Finally, 
these variables appear to exert effects additive 
to each other, since the longest average after- 
movement was seen in the high anxious-low im- 
pulsive subgroup and the shortest aftermove- 
ments were obtained in the low anxious-high im- 
pulsive subgroup. 

As measured here, impulsivity appeared to be 
the more influential variable. High impulsives 
showed significantly shorter aftermovements 
than low impulsives on both initial and final 
trials. Furthermore, among impulsive subjects 
with short initial aftermovements, high impul- 
sives showed a significantly greater percentage of 
reduction in aftermovement than low impulsives 
on the darkness interval trials. Aftermovement 
differences between the anxiety groups achieved a 
less stringent confidence level (p < .05) on ini- 
tial trials and they were not significant on the 
second group of trials. Furthermore, no relation- 
ship was found between the Taylor scale and 
cardiac rate. 

The relationship between heart rate and the 
aftermovement is complex. Neither average rate 
nor variability may be considered separately, 
and furthermore, differences in the average rate 
of low-variable subjects were unrelated to the 
aftermovement. However, within high-variability 
subjects, rate did parallel aftermovement dura- 
tion in the manner suggested by the activation 
hypothesis, that is, subjects with a rapid pulse 
reported significantly longer aftermovement. The 
variability-impulsivity relationship also proved 
to be more complex than was anticipated. Again 
no differences were found within low-variable 
subjects. However, variable, slow-pulsed indi- 
viduals tended to have high scores on the Wilkin- 
son impulsivity scale; variable, rapid-pulsed sub- 
jects generally achieved lower scores. 

The heart-rate interaction described above was 
unanticipated and will be explored further. It is 
important to point out that it was not one-sided: 
the low-variable subgroup means fell between 
those achieved by the two high-variable groups 
(see Table 5). Rate interacted with high varia- 
bility to produce aftermovement lengths at the 
extremes. A similar result was obtained for the 
Wilkinson scale. It will be interesting to discover 
if these effects can be obtained with other tasks. 

Whereas the impulsivity scale directly predicted 
both the initial and final afterimages, the cardiac 
rate interaction only related to the initial after- 
image. The physiological relationships are thus 
both more complex and less strong than those 
defined by at least one of the personality ques- 
tionnaires. Furthermore, when the other organ- 
ismic events recorded during this experiment 
were added to the cardiac measures, no improve- 
ment in aftermovement prediction was obtained.⁶ 
In part this result may be attributed to an ex- 
perimental design in which extremes were selected 
by questionnaire, while no special care was taken 
to assure disparity along physiological dimen- 
sions. A replication should either employ a 
large group of randomly selected subjects, or base 
subject participation on cardiac activity. 

It is important to consider possible differences 
between reported and perceived aftermovement. 
It is not clear from these data whether impulsive 
subjects saw shorter aftermovements, or simply 
could not delay the button press, despite a con- 
tinuing aftermovement. No subject reported this 
phenomenon, and in a sense the question is not 
meaningful, that is, a subjective state cannot be 
measured independent of behavior. However, one 
may ask if there were inconsistencies in the dif- 
ferent response systems involved. As one check 
on this hypothesis, electrical potentials were 
recorded from the muscle group involved in the 
button press. It was suggested that if high im- 
pulsives were pressing before the aftermovement 
stopped, they would also show a greater number 
of partial, anticipatory presses than low im- 
pulsives. No significant difference between these 
groups was observed, either in the number of 
EMG bursts or tension level during the prepress 
period. 

The general hypothesis that activation and 
control constructs are related to individual dif- 
ferences in the aftermovement received consid- 
erable support. Subjects who may be described 
as chronically anxious, or who characteristically 
find it difficult to delay responses, reported after- 
movement durations consistent with these char- 
acteristics. Furthermore, at least one autonomic 
response system bore a relationship to both the 
aftermovement and to the questionnaire esti- 
mate of behavior control. Future work should be 
focused on the direct manipulation of these 
variables. At the somatic level, subjects could be 
trained to alter heart rate prior to aftermovement 
judgments. The technology is available: operant 
control of heart-rate acceleration (Shearn, 1962) 


⁶ A multiple correlation analysis of the relationship 
between initial aftermovement and six physiological 
variables (heart rate, heart-rate variability, respira- 
tion rate, average conductance, number of GSRs, 
and bursts of muscle activity) was performed on the 
IBM 7090 computer at the University of Pittsburgh 
Data Processing Center. None of the obtained cor- 
relations approached an acceptable confidence level. 
Furthermore, the afterimage was evaluated for GSR 
level and variability groups in an analysis of vari- 
ance similar to that reported here for the cardiac 
subgroups. No significant main effects or interactions 
were obtained. 

and heart-rate variability (Hnatiow & Lang, 
1965) have both been demonstrated. 

Cognitive factors and set also deserve more 
consideration. Mayer and Coons (1960) have 
shown that not only the length, but the presence 
or absence of the aftermovement phenomenon 
may be manipulated by instructions, and that 
these effects interact with pathology. In the 
present experiment the instructions suggested 
that all would see an aftermovement, and no 
valence was assigned to duration. A different 
preparatory set would undoubtedly alter results. 

The interaction between cognitive factors and 
physiological state is also of considerable im- 
portance. While Eysenck and his associates (Ey- 
senck & Holland, 1960) obtained reduced after- 
movement following the administration of so- 
dium amytal, longer aftermovements were not 
associated with the stimulant d-amphetamine. 
Schachter and Singer (1962) suggest that emo- 
tional states involve both a physiological state 
of arousal and an appropriate cognition. It 
would be useful to study the effects of ampheta- 
mine on the aftermovement when subjects are 
uninformed about the drug’s effects, and anxious 
cognitions are encouraged, as opposed to a con- 
dition in which the drug action is explained. The 
separate effects of anxiety and physiological 
arousal could thus be assessed. In summary, the 
present results suggest provocative new directions 
for the study of the relationship between per- 
ceptual, somatic, and verbal responses. 

SOME RELATIONS BETWEEN MINIMAL CONTENT, ACQUIESCENT- 
DISSENTIENT, AND SOCIAL DESIRABILITY SCALES¹ 


DANIEL B. CRUSE² 


University of Miami 


68 Ss were administered 5 scales: a complete nonsense scale, a partial nonsense 
scale, K scored for social desirability (SD), a 40-item SD scale, and Edwards’ 
39-item SD scale. Both minimal content scales were highly reliable and inter- 
correlated, showing a consistent response tendency; the SD scales were sig- 
nificantly correlated, showing another consistent response tendency; the 
minimal content response scales were not significantly related to SD scales, 
showing the influence of meaningful content. In a 2nd study 150 Ss were 
randomly assigned to 1 of 5 groups. All groups took 4 scales: a partial non- 
sense scale, A, R, and the 39-item SD from the MMPI. The 5th scale was 
different for each group and was composed of 0, 25, 50, 75, or 100% neutral 
SD items. Changes in the percentage of neutral SD items from 0 through 100 
for the 5 scales were associated with a decrease in the mean SD scores, a de- 
crease in Kuder-Richardson Formula 21 coefficients based on SD keying, and 
a decrease in the magnitude of correlations with SD relevant scales. Acquiescence 
scales and indexes tended to show changes opposite to SD scales as the 
percentage of neutral items in scales increased. The partial nonsense scale was 
highly reliable and not generally correlated with SD-keyed scales. 


The response set of acquiescence has been em- 
phasized by several investigators in interpreting 
personality-type tests such as the MMPI (Fricke, 
1957; Jackson & Messick, 1958; McGee, 1962; 
Wiggins, 1959). Edwards (1957, 1961) has, on 
the other hand, emphasized the social desirability 
value of statements in accounting for the vari- 
ance in personality tests, rather than acquiescence. 
The purpose of Experiment I is to investigate 
several relations between acquiescence and social 
desirability. 

TABLE 1 

INTERCORRELATIONS BETWEEN NONSENSE SCALES, RELIABILITY COEFFICIENTS, 
MEANS, AND STANDARD DEVIATIONS 

Scale     PN        SD40         SDK      SD39     Spearman-Brown      K-R21        M      SD 

CN       .73*    [illegible]    -.05      .07          .68a*            .85        48.9   12.6 
PN               [illegible]    -.02      .08          .73b*            .76        26.2    7.0 
SD40                             .39*     .59*                      [illegible]    31.7    3.4 
SDK                                       .60*                          .65        18.3    4.3 
SD39                                                                    .66        31.7    4.1 

a Corrected for twice the length by the Spearman-Brown formula. 
b Corrected for four times the length by the Spearman-Brown formula. 
* p < .05. 


Cronbach (1946) pointed out that response 
sets might be studied by minimizing the word 
content in the usual test by providing nonsense 
items. If the text of a nonsense item does not 
directly affect a particular response, then the re- 
sponse made to that item may be due to other 
factors, one of which could be a tendency to say 
“Yes,” to acquiesce, or to say “No,” to dissent. 
In the first of two studies two nonsense-type 
scales were developed, and presented with per- 
sonality scales in order to assess the relation be- 
tween the response continuum of acquiescence- 
dissentience and the tendency to give socially de- 
sirable responses. 


EXPERIMENT I 
Method 


Sixty-eight students in undergraduate psychology 
classes were administered five scales in consecutive 
order. The first scale was composed of 100 items in 
Arabic (Harrell & Blanc, 1960). These items were 
generated by taking consecutive sets of eight words, 
without stress marks, from the Arabic text, capital- 
izing the first word and placing a period after the 
eighth word. 

The second nonsense-type scale was composed of 
50 items. This second scale was developed by se- 
lecting a page at random from Hilden's (1958) Uni- 
verse of Personal Constructs and beginning each 
item with the first two words in Hilden's list and 
finishing the item with one Arabic word for each 
English word. On the nonsense-type scales the 
subjects were instructed to answer Yes or No as they 
intuited or guessed agreement or disagreement with 
the items. The number of Yes responses was scored. 
The third scale was the K scale from the MMPI, 
scored for social desirability. The scale values of 
Messick and Jackson (1961) were used. The fourth 
scale was a 40-item social desirability (SD40) scale 
with items distributed over the social desirability 
continuum (Cruse, 1963). One-half the items in the 
SD test were scored Yes and one-half scored No. 
The fifth scale was Edwards’ 39-item SD (SD39) scale 
(Edwards, 1957). 

For all scales the format of items and answers 
was such that the Yes-No alternatives preceded the 
item. The position of the Yes-No alternatives was 
randomly ordered to minimize any position pref- 
erence for the left or right. 


Results and Discussion 


Table 1 presents the product-moment correla- 
tions between scales, reliability coefficients, means, 
and standard deviations. Both the Spearman- 
Brown and the Kuder-Richardson Formula 21 
(K-R21) coefficients for the nonsense scales indi- 
cate that these scales are reliable. The scale com- 
posed of 100 complete nonsense (CN) items in 
Arabic correlates significantly with the scale com- 
posed of 50 partial nonsense (PN) items. If an 
acquiescent-dissentient dimension refers to a reli- 
able tendency to respond Yes or No to items 
with little or no content, then one may suppose 
these nonsense scales are adequate measures of 
this dimension. The definition and assessment of 
acquiescence and dissentience is, however, not 
clear. 
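
The two reliability coefficients named above can be written out in their usual textbook forms. The article does not print either formula, so the sketch below is an assumption about the computation rather than a record of it, and the function names are hypothetical. The inputs for the check are the item counts given in the Method section and the means and standard deviations reported in Table 1 for the complete nonsense (CN) and partial nonsense (PN) scales.

```python
def kr21(n_items: int, mean: float, sd: float) -> float:
    """Kuder-Richardson Formula 21 from test length, mean, and SD."""
    if n_items < 2 or sd <= 0:
        raise ValueError("need at least two items and a positive SD")
    variance = sd ** 2
    return (n_items / (n_items - 1)) * (
        1 - (mean * (n_items - mean)) / (n_items * variance)
    )


def spearman_brown(r: float, length_factor: float) -> float:
    """Reliability predicted when a test is lengthened by `length_factor`."""
    return (length_factor * r) / (1 + (length_factor - 1) * r)


# CN scale: 100 items, M = 48.9, SD = 12.6; PN scale: 50 items, M = 26.2, SD = 7.0.
kr21_cn = kr21(100, 48.9, 12.6)
kr21_pn = kr21(50, 26.2, 7.0)
```

Rounded to two decimals, these values reproduce the K-R21 coefficients of .85 and .76 listed in Table 1 for the two nonsense scales.
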

It has been pointed out that there are several 
aspects to the notion of acquiescence (Edwards, 
1961; Jackson & Messick, 1958; Wiggins, 1962). 
One aspect of acquiescence involves a general 
tendency on the part of the subject to agree with 
an item when no issue is apparent. Issues and 
specific content may be kept at a minimum by 
requiring heterogeneous, obscure, or difficult 
items. Items with no discernible theme or special 
content presumably result in tests with low inter- 
nal reliability because the subjects respond in an 
inconsistent, heterogeneous fashion and thereby 
produce low interitem correlations. Consistent re- 
sponding resulting in a high reliability coefficient 
is not desired because it would imply some con- 
tent consistency and content control, rather than 
nonspecific items permitting an unhindered ex- 
hibition of acquiescent tendencies. 

A second aspect of acquiescence requires high 
reliability coefficients as an indication of an inter- 
nally consistent trait of acquiescence (Couch & 
Keniston, 1960; Gage, Leavitt, & Stone, 1957). 
High reliability coefficients are desired because 
they show accurate measurement relative to the 
trait in question. 

These two aspects of the acquiescence notion 
lead to a contradiction. The first aspect of acqui- 
escence requires dissimilar and obscure content 
resulting in low reliability coefficients, but the 
second aspect of acquiescence requires high reli- 
ability coefficients. 

The nonsense scales constructed in this study 
fulfill the requirement of obscure content, but 
not the low reliability requirement, demanded by 
the first aspect of acquiescence. The high reli- 
ability coefficients of the nonsense scales show 
sufficient reliable variance to vitiate the low reli- 
ability coefficient requirement implied by hetero- 
geneous, obscure, or difficult items. The reliability 
coefficients of the nonsense scales indicate that 
reasonably high reliability coefficients may be 
obtained when no issue is obviously apparent in 
the item content. 

Content in the sense of a discriminative con- 
trol by English words is not involved in the CN 
scale. Content in the sense of discriminative con- 
trol by the scale and response format is an obvi- 
ous and necessary component of all paper-and- 
pencil scales. If scales in Arabic provide the high 
internal consistency shown, one might question 
any attempt to resolve the acquiescence problem 
with an indirect approach of relying upon reli- 
ability coefficients as a guide to dissimilar and 
obscure content. Some direct criteria of dissimilar 
and obscure content independent of response fre- 
quency may be necessary. 

TABLE 2

COMPOSITION OF EXPERIMENTAL SCALES VARYING
IN THE PERCENTAGE OF NEUTRAL ITEMS

                   Social desirability continuum
Percentage      Undesirable       Neutral       Desirable
neutral
Scale value   1.50   [?]   4.30   [?]   5.70   [?]   [?]
    0           16    16      0     0     16    16
   25           12    12      8     8     12    12
   50            8     8   [remaining entries illegible]
   75            4     4     22    26      4     4
  100            0     0     27    37      0     0


Fritz (1927) and Cronbach (1946) found a
tendency for subjects to endorse a larger proportion
of items than were rejected. As shown in
Table 1, the mean number of items endorsed on
the CN scale indicates no clear tendency for college
subjects to endorse larger proportions of
nonsense items than were rejected. The mean
number of endorsements for the CN scale was not 
different from an expected mean of 50 (t= .71, 
p> .05). The mean number of items endorsed 
on the PN scale was also not statistically different
from an expected 25 (t = 1.41, p > .05).
Table 1 shows that neither of the nonsense scales
correlates significantly with the social desirability
scales, although each social desirability scale cor- 
related significantly with other social desirability 
scales. The nonsignificant correlations reported 
in Table 1 between the nonsense scales and the 
social desirability scales do not support the 
notion of an acquiescent-dissentient dimension 
being correlated with scales scored for social de- 
sirability. The nonsignificant correlations between 
the nonsense and social desirability scales are 
presumably due to the effects of meaningful, as 
opposed to nonsense, content upon scale scores. 
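
The comparisons of mean endorsements with their expected values reported above are one-sample t tests. A minimal sketch of that computation follows, assuming raw endorsement counts are available. The function name and the sample counts are hypothetical and are not data from this study.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(scores, expected_mean):
    """Return (t, df) for a one-sample t test of the mean against expected_mean."""
    n = len(scores)
    standard_error = stdev(scores) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(scores) - expected_mean) / standard_error, n - 1


# Hypothetical endorsement counts on a 100-item scale (not data from the study).
hypothetical_counts = [48, 52, 50, 54, 46]
t_value, df = one_sample_t(hypothetical_counts, 50)
```

A t near zero, as in this made-up sample, corresponds to the absence of any preference for endorsing over rejecting items.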




EXPERIMENT II 


Experiment I indicated that scales with mini- 
mal content are reliable, intercorrelate highly, but 
do not correlate significantly with scales scored 
for social desirability. The purpose of Experi- 
ment II was to further investigate the relation 
between nonsense scales with minimal content 
and scales varying in the proportion of items in 
the neutral range of the social desirability con- 
tinuum and investigate the relation between sev- 
eral personality scales and social desirability 
scales varying in the proportion of neutral items. 

It has been suggested that correlations of 
personality-type scales with social desirability 
scales decrease as the proportion of neutral items 
in the scale increases (Edwards, 1957; Hanley,
1956). This hypothesis was tested by construct- 
ing scales with varying proportions of neutral 
items and correlating them with several other 
personality scales. 


Method 


Five experimental social desirability scales were 
constructed from a 1647-item pool previously scaled 
for social desirability (Cruse, 1965). A linear re- 
gression line was fitted to the regression of the 
frequency of endorsing an item on the item's social
desirability scale value. All items falling within one
standard deviation above or below the regression of
frequency of endorsement on social desirability scale
values were used as the parent population. The social 
desirability scale was divided into five equally spaced 
categories and items randomly chosen from these 
categories. 
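
The selection of the parent population can be sketched as follows. This is an illustration only: it assumes that "one standard deviation above or below the regression" refers to the standard error of estimate around the least-squares line, and the function name, tuple layout, and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean


def select_parent_items(items):
    """Keep items whose endorsement frequency lies within one standard error
    of estimate of the least-squares line fitted on scale value.

    items: list of (item_id, sd_scale_value, endorsement_frequency).
    """
    xs = [scale_value for _, scale_value, _ in items]
    ys = [frequency for _, _, frequency in items]
    mean_x, mean_y = mean(xs), mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Standard error of estimate (assumed meaning of "standard deviation" here).
    se_estimate = sqrt(sum(r * r for r in residuals) / (len(items) - 2))
    return [item for item, r in zip(items, residuals) if abs(r) <= se_estimate]


# Hypothetical pool: item 5 is endorsed far more often than its scale value predicts.
pool = [(i, float(i), 0.1 * i) for i in range(1, 10)]
pool[4] = (5, 5.0, 0.9)
kept = select_parent_items(pool)
```

In this made-up pool only the deviant item falls outside the band, so the remaining items would form the parent population from which scale items are drawn.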

Table 2 shows the composition of the scales containing
0, 25, 50, 75, and 100% neutral items. Each
scale contains 64 items. Items within each of the
five categories were randomly drawn. Random selection
resulted in a small imbalance, within the
neutral range, in the number of items above or below
the neutral point.


TABLE 3

CORRELATIONS, HOMOGENEITY TESTS, AND AVERAGE r
FOR THE PARTIAL NONSENSE SCALE

                      Percentage neutral
PN with        0       25       50       75      100       χ²     Average r
A            -.03     .20      .10      .13     -.32      4.89       .02
SD39          .13    -.24  [illegible]  -.04     .43      6.12      -.13
R            -.33    -.46     -.14      .00     -.14      4.84      -.20*
SDExp         .17    -.24     -.10     -.16      .14      3.65      -.04
YesExp        .44†   -.04      .34      .25     -.08     11.48†

† [footnote illegible]
* p < .05.

In addition to these experimental scales another 
partial nonsense scale (PN) was developed. This 
scale contained 100 items and was made by randomly
selecting a page from Hilden's (1958) Personal
Concept set, beginning each item with the
first two words from Hilden's set, and finishing the
item with one Arabic word for each English word.
Items were selected so that no two items began
with the same two English words. The subjects were
instructed to guess or intuit the answers. The number
of Yes responses was scored.

The subjects for the study were volunteers from 
undergraduate psychology classes. The 150 subjects
were randomly assigned to one of five groups.
Each group took one of the experimental scales
along with the partial nonsense scale and the R, A, and
Edwards SD39 scales from the MMPI (Dahlstrom &
Welsh, 1960). The scales were administered individually
and the order of taking was PN, R, A, SD39,
and experimental scale.


Results 


Five independent groups of 30 subjects took 
the PN scale of 100 items. No significant differ- 
ences were found between the five means of the 
PN scale (F = 1.45, df = 4/145, p > .05). A difference
was found between the combined PN
means of 55.45 and the expected mean of 50, 
assuming no response preference on the PN scale. 
A test of the combined independent groups was 
significant (χ² = 31.45, df = 10, p < .05) and
shows a tendency for the subjects to endorse 
more items than were rejected. The average z'
Spearman-Brown correlation was .91 and the
average z' K-R20 correlation was .90. These large
reliability coefficients corroborate the previously 
obtained high reliability coefficients in Experi- 
ment I for PN scales. 
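
The two steps behind an "average z' Spearman-Brown correlation" can be sketched as follows, assuming the usual formulas: the Spearman-Brown step-up r_full = 2r / (1 + r) for a split-half correlation r, and averaging across groups through the z' = arctanh(r) transformation. The function names and the split-half values are illustrative assumptions, not the group values of this study.

```python
from math import atanh, tanh


def spearman_brown(r_half):
    """Step a split-half correlation up to the reliability of the full-length scale."""
    return 2 * r_half / (1 + r_half)


def average_r(correlations):
    """Average correlations by transforming to z', averaging, and transforming back."""
    mean_z = sum(atanh(r) for r in correlations) / len(correlations)
    return tanh(mean_z)


# Hypothetical split-half correlations for five independent groups.
split_halves = [0.80, 0.85, 0.82, 0.86, 0.84]
full_length = [spearman_brown(r) for r in split_halves]
average_reliability = average_r(full_length)
```

Averaging in the z' metric rather than averaging the raw coefficients keeps large coefficients from being understated, which matters when the values are as high as those reported here.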

Table 3 shows the correlations between the PN
scale and A, R, SD39, the five experimental scales
scored for social desirability (SDExp), and the
five experimental scales scored for the number
of Yes responses (YesExp). A chi-square test
for homogeneity of correlations for the five independent
samples showed the correlations between
PN and A, R, SD39, and SDExp to be within
acceptable limits and therefore considered as estimates
of the same population value. The PN
scale did not correlate significantly with the three
social desirability relevant scales, A, SD39, and
SDExp. The lack of a significant correlation
between PN and A, SD39, and SDExp lends support
to the notion that measures of social desirability
are not heavily influenced by acquiescent-dissentient
effects.
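
A chi-square test for the homogeneity of independent correlations is commonly computed from the z' transformation, weighting each sample by n - 3. The sketch below assumes that standard form; the source does not spell out its formula, and the function name is hypothetical.

```python
from math import atanh


def homogeneity_chi_square(correlations, sample_sizes):
    """Return (chi_square, df) for the hypothesis that independent correlations
    estimate one population value."""
    zs = [atanh(r) for r in correlations]
    weights = [n - 3 for n in sample_sizes]
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    chi_square = sum(w * (z - z_bar) ** 2 for w, z in zip(weights, zs))
    return chi_square, len(correlations) - 1


# PN correlations with the SD-scored experimental scales as printed in Table 3,
# with 30 subjects in each of the five groups.
chi_square, df = homogeneity_chi_square([0.17, -0.24, -0.10, -0.16, 0.14], [30] * 5)
```

Applied to that row of Table 3, the sketch gives a chi-square close to the tabled 3.65 on 4 degrees of freedom, which suggests the reported tests were of this general form.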

The R scale is interpreted as the polar opposite 
of acquiescence (Edwards, Diers, & Walker, 1962; 
Jackson & Messick, 1962; Wiggins, 1962). The 
significant negative correlations between PN and 
R shown in Table 3 support the notion of the 
PN scale as measuring acquiescent-dissentient re- 
sponse tendencies, because subjects with rela- 
tively high False scores on R tend to give a rela- 
tively high number of No responses on the PN 
scale. 

The chi-square of 11.48 in Table 3 indicates
that the five correlations between PN and the
YesExp scales may not be considered estimates
of the same population value. These correlations
show a relation between the acquiescent-dissentient
responses on the PN items and on
personality-type items balanced for social desirability
and the number of Yes responses. There
does not appear to be any clear or consistent relation
between PN and YesExp, since the percentage
of neutral items and correlations differ for
the five groups and show no consistent trend. It
is important to note that there is a significant
correlation between PN and the 0% neutral
YesExp scale, indicating the possibility of obtaining
a positive relationship between an
acquiescent-dissentient scale and a scale balanced
for Yes-No responses containing items on the
extreme ends of the social desirability continuum.

Table 4 shows the means and variances of the 
five experimental scales when scored for social 
desirability and for the number of Yes responses. 
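
Equality of the five group variances in Table 4 can be examined with Hartley's F-max ratio, the largest variance divided by the smallest. The sketch below uses the variances as printed in Table 4. The function name is hypothetical, and the ratio must still be compared with a tabled critical value for five groups of 30 subjects, which is not reproduced here.

```python
def hartley_f_max(variances):
    """Return Hartley's F-max statistic: largest variance over smallest variance."""
    return max(variances) / min(variances)


# Variances for the 0, 25, 50, 75, and 100% neutral scales, as printed in Table 4.
sd_exp_variances = [90.25, 40.70, 51.84, 22.56, 23.04]
yes_exp_variances = [20.16, 30.14, 34.46, 47.61, 65.29]

f_max_sd = hartley_f_max(sd_exp_variances)
f_max_yes = hartley_f_max(yes_exp_variances)
```

The statistic assumes equal group sizes, a condition met here by the five groups of 30.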


TABLE 4

MEANS AND VARIANCES OF THE EXPERIMENTAL SCALES
SCORED FOR SOCIAL DESIRABILITY AND
NUMBER "YES"

                        Percentage neutral
                      0       25       50       75      100
SDExp M             50.80    48.60    43.50    40.70    34.67
SDExp variance      90.25    40.70    51.84    22.56    23.04
YesExp M            31.73    31.60    28.77    31.90    31.60
YesExp variance     20.16    30.14    34.46    47.61    65.29


Hartley's test indicates that the hypothesis of
equal variances may be rejected for both the
SDExp and YesExp scales at the .05 level. One
may also note the positive relation between the
means and variances for the SDExp scales.

An analysis of variance of the mean scores of
the five YesExp scales indicated no significant
differences (F = 1.33, df = 4/145, p > .05). The
comparability of the mean Yes scores shows the
composition of the items to be within expected
limits, that is, about one-half Yes and one-half
No responses. An analysis of variance of the
mean scores for the SDExp scales shows a significant
difference in the mean scores (F = 24.30,
df = 4/145, p < .05), and a significant linear
component when orthogonal components are
tested (F = 95.30, df = 1/145, p < .05). Comparable
mean Yes scores for the five YesExp
scales and linearly increasing SD scores for the
SDExp scales indicate that the five scales have
the desired balance in the mean number of Yes
responses and the desired linear increase in the
number of socially desirable responses as the
percentage of neutral items increases.


TABLE 5

RELIABILITY COEFFICIENTS, INTERCORRELATIONS,
AND HOMOGENEITY TESTS FOR THE
EXPERIMENTAL SCALES

                      Percentage neutral
                    0      25      50      75     100       χ²
K-R20, SDExp       .89     .72     .73     .33     .29
K-R20, YesExp      .21     .48     .55     .58     .77
SDExp-A           -.19    -.76    -.72    -.37    -.43    10.05*
SDExp-R            .08     .37     .34    -.08    -.42    12.10*

* p < .05.


Table 5 shows the K-R20 reliability coefficients
for the five experimental scales when scored for
SD and for Yes. One may note the decrease in
the K-R20 coefficients, from .89 to .29, for SD
scores as the percentage of neutral items increases.
The decreasing K-R20 coefficients indicate
decreasing contributions of social desirability
variance for the scale relative to the total
variance of the scale. The increase in K-R20 coefficients
for the Yes scoring, from .21 to .77,
as the percentage of neutral items increases indicates
an increasing contribution of Yes score
variance for the scale relative to the total variance
of the scale. Table 5 also shows a nonhomogeneous
set of correlations between the
SDExp scales and the R scale, and a nonhomogeneous
set of correlations between SDExp scales
and the A scale.

Figure 1 displays the correlations between Edwards'
SD39 scale and the SDExp scales. A test
of the homogeneity of independent correlations
showed these correlations to be nonhomogeneous
(χ² = 14.02, df = 4, p < .05). Hanley (1956)
and Edwards (1957) have both suggested a de- 
crease in correlation between a social desirability 
scale, such as the SD39 scale, and scales with
increasing proportions of neutral items. This prediction
is substantiated by the correlations between
SD39 and the SDExp scales in Figure 1.
One may note the decrease in correlation from
.83 to .31 as the percentage of neutral items
increases from 0% to 100%. 

The experimental scales were separately
scored for Yes responses to items with desirable
scale values (SD-YesExp) and No responses to
items with undesirable scale values (SD-NoExp).
The correlations between SD39 and SD-YesExp
are nonhomogeneous (χ² = 8.51, df = 4, p < .05),
as are the correlations between SD39 and SD-NoExp
(χ² = 9.95, df = 4, p < .05). The correlations
between SD-YesExp and SD-NoExp
are also shown in Figure 1. The correlations
between SD-YesExp and SD-NoExp are nonhomogeneous
(χ² = 27.80, df = 4, p < .05).

The relatively high positive correlation of .64
between SD-YesExp and SD-NoExp for the 0%
neutral scale supports a social desirability interpretation
of these scales since there is a strong
tendency for subjects to give socially desirable
responses to both desirable and undesirable items
with nonneutral scale values. The change in correlation
between the SD-YesExp and SD-NoExp
scales from .64 to -.48 also supports the suggestions
of both Hanley (1956) and Edwards
(1957) that scales varying in the proportion of
neutral items may be differentially sensitive to
social desirability effects.


[Figure 1: correlation coefficients plotted against percentage neutral for four pairs of scales: SD39 with Exp. SD, SD39 with Exp. SD-No, SD39 with Exp. SD-Yes, and Exp. SD-No with Exp. SD-Yes.]

FIG. 1. Correlations for SD39 and experimental scales.

This change in correlation, shown in Figure 1,
from .64 to -.48 for SD-YesExp and SD-NoExp
scales may also be seen as a change in relation
between these two scales. The correlation of .64
shows a positive relation between endorsing desirable
items and rejecting undesirable items, and
thereby supports a social desirability interpretation
of the 0% neutral scales. The -.48 correlation
shows a negative relation between endorsing desirable
items and rejecting undesirable items, and
thereby supports an acquiescence-dissentience interpretation
of the 100% neutral scale. The
supported interpretation clearly depends on the
percentage of neutral items in the scales.

The correlation of Edwards' SD39 scale with
the SD-YesExp and SD-NoExp scales is also
shown in Figure 1. The SD39 scale does not show
the same pattern of correlations with SD-YesExp
as it does with SD-NoExp. The SD39 scale correlates
from a moderate .74 to .49 with the
SD-NoExp scale and from .76 to -.14 with the
SD-YesExp scale. This difference in correlation
trends between the scales suggests that the SD39
scale is differentially influenced by undesirable
items as compared to desirable items.


Discussion 


As in Experiment I, the partial nonsense scales
show high reliability
coefficients. If acquiescence is taken to be related
to a tendency to endorse items with minimal,
irrelevant, and obscure content, the PN scale
may be said to be a reliable indicator of this
tendency. However the PN scale is interpreted,
it does not correlate significantly with any of the
scales keyed for or representative of social
desirability factors.

Although the K-R20 coefficient may be misleading
as a sign of trait variance in nonsense
scales, it can be helpful when item parameters
are independently specified. The decrease in
K-R20 coefficients for the SDExp keying as the
proportion of neutral items increases stands
in sharp contrast to the increase in K-R20 coefficients
for the YesExp keying as the proportion
of neutral items increases. The inverse relation
between these K-R20 coefficients lends
credence to the effectiveness of manipulating
item parameters in terms of social desirability
operations when the degree of interitem correlation
needs to be controlled.
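
The K-R20 coefficient discussed here depends on the ratio of summed item variances to total score variance, which is why it tracks the degree of interitem correlation. A minimal sketch of Kuder-Richardson formula 20 for dichotomously scored items follows. The function name and the tiny response matrices are hypothetical, and the population form of the total variance is assumed.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for a subjects-by-items matrix of 0/1 scores."""
    n_subjects = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_subjects
    # Population variance of total scores (assumed form).
    total_variance = sum((t - mean_total) ** 2 for t in totals) / n_subjects
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_subjects  # proportion keyed
        sum_pq += p * (1 - p)
    return n_items / (n_items - 1) * (1 - sum_pq / total_variance)


# Hypothetical data: perfectly consistent items versus uncorrelated items.
consistent = kr20([[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]])
uncorrelated = kr20([[1, 0], [0, 1], [1, 1], [0, 0]])
```

The two made-up matrices bracket the range seen in Table 5: items that covary completely push the coefficient to its ceiling, and items that do not covary leave it at zero.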

Decreases in the K-R20 coefficients as the proportion
of neutral items increases may also be
seen in several other studies. Jackson and Messick
(1962) found that scales composed of items
taken from the extremes of the social desirability
continuum showed the largest K-R20 coefficients,
while scales composed of items taken from the
neutral range showed smaller K-R20 coefficients.
Edwards, Walsh, and Diers (1963) obtained a
negative correlation between K-R20 coefficients
and the proportion of neutral items in a scale.
As the proportion of neutral items increased,
the K-R20 coefficients decreased.

The noted change in correlation between the 
SD-YesExp and SD-NoExp scales from a positive
value (.64) to a negative value (—.48) results 
in two different interpretations of social desira- 
bility derived scales. The correlation of .64 shows 
a positive relation between endorsing desirable 
items and rejecting undesirable items, and 
thereby supports a social desirability interpreta- 
tion. The correlation of —.48 shows a negative 
relation between endorsing desirable items and 
rejecting undesirable items, and thereby supports 
an acquiescence-dissentience interpretation.

While the shift in correlation from a positive 
value to a negative value supports a social desir- 
ability interpretation of personality scales in 
general, since the scales were constructed with 
social desirability principles, it also suggests 
one variety of an acquiescent-dissentient hy- 
pothesis. The supported acquiescence-dissentience 
hypothesis emphasizes that for personality-type 
scales balanced for Yes and No items, there is 
an inverse relation between the tendency to en- 
dorse and reject items in the neutral range of 
the social desirability continuum, and that this 
inverse relation decreases and becomes positive 
as the proportion of neutral items in the scale 
decreases. These acquiescent-dissentient effects 
are not independent of content and they do not 
occur regardless of content since the change in 
correlation between these scales varies with item 
content as specified by social desirability scaling 
and keying. 

The SD39 scale may be interpreted as a pure
measure of social desirability and therefore not
affected by acquiescence-dissentience (Edwards
et al., 1962). This would seem to imply that the
pattern of correlations of SD39 with social desirability
items keyed No would be similar to that with social
desirability items keyed Yes. The correlations
shown in Figure 1 between SD39 and SD-YesExp
and SD-NoExp suggest that this is not the case
and that the SD39 scale is differentially influenced
by undesirable items.

One may note that some imbalance occurs 
in the number of undesirable and desirable items 
in the experimental scales, as shown in Table 2. 
Although the correlations for these scales may 
be somewhat affected because of this imbalance, 
it appears unlikely that the imbalance would 
seriously alter the direction and magnitude of 
the reported correlations. 

Effects comparable to those shown in Figure 1 
for SD39 and the 100% neutral SD-NoExp and
100% SD-YesExp scales were found by Diers
(1964). A social desirability scale composed of 
undesirable items correlated .43 with a neutral 
scale keyed false, and the same social desirability 
scale correlated —.48 with a neutral scale keyed 
true (Diers, 1964). Edwards and Diers (1963) 
also report negative correlations between the 
SD39 scale and neutral scales scored for the
number of Yes responses. 

The difference in patterns of correlation between
SD39 and the experimental scales may be
interpreted as due to an imbalance in the SD39
scale, or due to the influence in neutral scales
of social desirability factors. Edwards and Diers 
(1963) have chosen the alternative which em- 
phasizes the influence in neutral scales of social 
desirability factors. They show that the point at 
which the probability of a True response equals 
the probability of a False response is approxi- 
mately between 5.7 and 5.8 on the social desir- 
ability continuum, not at the social desirability 
continuum neutral point of 5.0. A scale was con- 
structed of items close to the 5.75 point on the 
social desirability continuum and was found to 
have the low correlation of .088 with the SD39
scale. The interpretation given by Edwards and 
Diers (1963) that neutral items are not inde- 
pendent of social desirability effects would be 
appropriate for the population of items from 
which the experimental scales were drawn (Cruse, 
1965). The point of equal probability for Yes 
and No responses is equal to 5.14, a point slightly 
on the desirability side of the neutral point on 
the social desirability continuum. 

While the SD39 scale may be taken as relatively
pure and used as a reference point, it
is also possible to take the neutral category on 
the social desirability continuum as a constant 
and interpret the tendency for subjects to give 
slightly more No or False responses and fewer Yes
or True responses to items at the neutral point 
as an indication of dissentient responses to neu- 
tral range items. For this interpretation of the 
items from which the experimental scales were
drawn, the probability of a neutral item being
endorsed is .48. 

The comparability of social desirability scale 
values across different judging groups (Edwards, 
1957) suggests that the operation of scaling may
be more general, invariant, and essential to the 
notion of social desirability than any particular 
scale. The generality of the scaling techniques 
would recommend the neutral category as a 
more serviceable and invariant standard than
particular scales.


AN EXPERIMENTAL AND THEORETICAL NOTE ON “CONSCIOUS AND
PRECONSCIOUS INFLUENCES ON RECALL” 1


LEONARD WORELL AND JUDITH WORELL
Oklahoma State University 


As an outgrowth of psychoanalytic conceptions, Spence proposed in previous 
work that “conscious, preconscious, and unconscious” conceptions are useful 
predictors of verbal recall and evidence was presented purporting to be con- 
sistent with these hypotheses. Since the present authors believed that these 
speculations were unnecessarily complicated and that Spence’s research lacked 
certain critical controls, a single variable hypothesis of associative clustering 
was presented and 4 studies were conducted. Employing associates, control, 
and buffer words together with varying critical stimuli, the principal findings 
were (a) before a critical stimulus was recalled, significantly more associates 
than control words were recalled whether or not there was any relationship of 
the critical stimulus to associates, while after the critical stimulus was recalled
associates did not differ significantly in recall from nonassociates; (b) early
recallers, late recallers, and nonrecallers of any critical stimulus consistently
recalled more associates than control words. These findings were neither con- 
sistent with Spence’s previous results nor his trivariable “consciousness” schema, 
but they did support the associative cluster proposal. 


In a recent paper Spence (1964) presented
a rather intricate hypothetical account of
“conscious” and “preconscious” and apparently 
"unconscious" influences on the recall of verbal 
stimuli. On the supposition that these assumed 
states follow different “laws,” he proposed that 
when a stimulus is preconscious, that is, not in 
“conscious awareness,” its effect would be to 
“fan out” and exert a “silent influence” over a 
wide network of associations; once the stimulus 
became conscious, that is, apparently verbalized, 
associations would now follow a “single channel 


1 This research was supported by Grant M-4891 
from the National Institute of Mental Health, 
United States Public Health Service. The authors 
gratefully acknowledge the assistance of Michael 
Campbell. 


principle—we think of one thing at a time”; 
and “forgotten” stimuli should have “no reason 
to... freely influence the preconscious net- 
work of associates.” 

With the foregoing reasoning two identical 
“cheese” experiments were performed by Spence 
where subjects were presented serially with 27 
words which they were required to recall—these 
consisted of 6 buffer words, 10 associates to 
CHEESE, and 10 unrelated words. The critical 
word CHEESE was imbedded in the center (actu- 
ally right of center) of the list. 

The principal findings were: 


1. Before CHEESE was recalled a significantly
larger number of associates than control words 
was recalled. This was interpreted as supporting
the notion of the silent influence of the preconscious
word CHEESE on associates.

2. Early recallers of CHEESE did not differ
reliably in their recall of associates and control 
words whereas late recallers elicited significantly 
more associates. This was viewed as consistent 
with the proposal that later recall of CHEESE 
should allow for more "associative diffusion." 

3. After CHEESE was recalled and when CHEESE
was forgotten, an equal number of associates 
and control words was recalled. These were 
interpreted as confirming, respectively, the views 
that “consciousness” operated on a “single-channel
principle” and that “defensive processes
inhibit associative fan out.”


While it would be difficult to deny that the 
language of the foregoing hypothetical structure 
has a strongly romantic appeal, our analysis of 
the methodology and results suggested a simpler 
formulation as well as the need for several 
critical control experiments. In the following all 
data will be approached by the single assumption
of associative clustering.2 According to this
assumption, when an articulate human adult is 
confronted with the problem of recalling a series 
of words, he will attempt to organize these 
words into one or more clusters of related items. 
In his paper Spence proposed that the 10 associ- 
ates to CHEESE were not related to one another 
because when single associations were obtained 
to each separate associate there was almost no 
tendency for one associate to produce another. 
We would argue, however, that this finding has 
little relevance to the situation where a person 
is presented with these associates as a group. 
Thus, we assume that the likelihood is greater 
for any 10 associates to an eleventh word to 
have a higher interitem associative strength than 
would another 10 words picked at random. It is, 
of course, not unreasonable to expect further 
that when a person is presented with both associ- 
ates to a particular stimulus and control words, 
he may use this critical stimulus to cluster the 
associates or he may impose one or more per- 
sonal clusters on the associates. In any case, 
by virtue of this associative clustering, there 
should be a stronger tendency to recall more 
associates and to recall them earlier than control 
words regardless of whether the critical stimulus 
(in the previous case CHEESE) is recalled, re- 
called early or not even present in the original 
list. To examine these considerations, four 
experiments were conducted. 


2 For a more detailed coverage of various approaches
to associative clustering, see Bousfield
(1953) and Deese (1959); also see Discussion.


METHOD


Since the findings were with one exception the 
same in all experiments, the four studies will be 
presented together. In each study subjects were 
presented with 6 buffer words, 10 associates to either 
ANGER or CHEESE, and 10 unrelated words matched
for Thorndike-Lorge (1944) frequency. CHEESE asso- 
ciates, control, and buffer words were those used by 
Spence (1964); ANGER associates were obtained from 
Russell and Jenkins (1954). All subjects were told 
that 27 words would be presented orally and that 
they would be asked to reproduce them from 
memory. Words were read at the rate of 1 per 
second. After the completion of each list subjects 
were asked to write down all the words they could 
remember in any order they wished. They were 
also told that they could take as much time as they 
wished and that they could guess. The basic variations
between experiments consisted of differences
in the critical stimulus, the 10 associates, or both.


Study 1 


A total of 134 subjects in an undergraduate 
course in adjustment was employed. Instead of 
CHEESE and associates to CHEESE, ANGER and associ- 
ates to ANGER were substituted. Here, as in all the 
following studies, the critical word was in the 
fourteenth rather than the fifteenth position as was 
the case in Spence's study. The following words 
were read: street, bird, dime, flag, LOUD, arch, HATE,
SCORN, chair, FEAR, best, GRIEF, gift, ANGER, frail,
HURT, sand, RAGE, hedge, TEETH, FURY, mirror,
HARSH, trunk, chin, number, joke (ANGER associates
are capitalized). Associative frequencies for the
ANGER associates ranged from 83 to 1 per 1,008
responses (Russell & Jenkins, 1954).


Study 2 


This was a duplication of Study 1 with the sole 
exception that CHEESE was substituted for ANGER in
the list. The ANGER associates were retained. Seventy- 
nine introductory psychology students were used 
here. 


Study 3 


Spence’s original list containing the CHEESE associates
was used. The modification consisted of substituting
ANGER for CHEESE as the critical stimulus.
Consequently, 112 introductory students received
ANGER with CHEESE associates.


Study 4 


Since it was possible that the emotional tone of
the stimulus ANGER might have a different effect
from a more neutrally toned stimulus, this final
study duplicated the previous one with the modification
that the critical stimulus SHIRT was substituted
for ANGER. The word SHIRT has the same Thorndike-Lorge
frequency as CHEESE. In this study, 108 subjects
from introductory psychology received SHIRT
and the CHEESE associates.


TABLE 1

MEAN RECALL OF THREE CLASSES OF WORDS IN FOUR STUDIES BEFORE AND AFTER
THE RECALL OF THE CRITICAL STIMULUS

Study

1. ANGER with ANGER associates
2. CHEESE with ANGER associates
3. ANGER with CHEESE associates
4. SHIRT with CHEESE associates


RESULTS 


The findings will be presented in two major
sections: data on the recallers of the critical
stimulus and data on the nonrecallers (“forgetters”)
of the critical stimulus.


Recallers of the Critical Stimulus 


The percentage of recallers of the critical
stimulus varied over a 23% interval from study
to study. Where the ANGER associates were used,
44% of the subjects recalled the critical stimulus
ANGER (Study 1) while 38% recalled the critical
stimulus CHEESE (Study 2). Where the CHEESE
associates were employed, 21% recalled the critical
stimulus ANGER (Study 3), and 24% recalled
the critical stimulus SHIRT (Study 4). The foregoing
indicates that the critical stimulus, regardless
of its content, was recalled more readily
when the more emotionally toned (i.e., ANGER)
associates were present.

Recall before the critical stimulus. The associative
cluster hypothesis leads to the expectation
that more associates than control words will
be recalled before the critical stimulus is recalled
irrespective of the critical stimulus.3 Since the
cluster(s) imposed by each subject are taken to
consist of items of comparatively high interassociative
strengths, the clusters should be prepotent
in strength relative to unrelated control
words. In line with the foregoing expectation,
the left-hand portion of Table 1 reveals that
in all four studies there is a clear superiority
for associates to be recalled over control words.
Furthermore, this superiority in recall is present
regardless of the content of the critical stimulus.
In fact, recall of associates is relatively better
in the three studies where the critical stimulus
is not related to the associates. Statistical analyses
(two-tailed Wilcoxon tests) indicated that
the associates were more readily recalled at
better than the .01 level in each of the four
studies.

3 It is clearly conceivable that a critical stimulus
could be selected such that it would be able to usurp
the position in recall of all other stimuli in a list.
Under these conditions the critical stimulus would
be more prepotent in strength than any clusters
and different predictions would obviously be
necessitated.

Recall after the critical stimulus. Although 
clustered associates should be prepotent, those 
associates on which the subject has not imposed 
a clustering should have equivalent strengths to 
unrelated control words, since there is nothing 
contributing to their differential strengths. Therefore,
no differences in recall should be found
between associates and control words after the 
clusters have been exhausted or after recall of 
the critical stimulus. It is important to empha- 
size again that the content of the critical stimu- 
lus is largely irrelevant. Rather, when the critical 
stimulus is imbedded in the center of a list it is 
surrounded by competing stimuli. Therefore,
unless this central stimulus is part of a cluster 
it should appear approximately in the center of 
recall and after the highly clustered associates 
have been recalled. Reference to the right-hand 
portion of Table 1 indicates that even here 
there is a stronger tendency to recall associates 
after the critical stimulus in all four studies. 
However, the differences are not statistically 
reliable in any of the four comparisons (two- 
tailed Wilcoxon tests). 

Early versus late recallers of the critical 
stimulus. Spence found that subjects who recalled
his critical stimulus CHEESE early did not
show a significantly different total recall of as- 
sociates over control words, while those who 
recalled the critical stimulus late demonstrated 
significantly greater recall of associates over con- 
trol words. Further, he found that a comparison 
of the differences between associates and control 
words for the early and late recallers was also
significant in favor of the latter group. These 
findings are puzzling from the standpoint of the 
associative cluster proposal, since both early and 
late recallers would be expected to recall more 
total associates than control words. Further, this 
prediction again would hold irrespective of the 
content of the critical stimulus. Table 2 presents 
the findings pertinent to these considerations 
where subjects who recalled the critical stimulus 
in the first half of their recalls are classed in 
the early column, while those who recalled the 
critical stimulus in the second half of their 
recalls are listed in the late column. Table 2 
makes it clear that, contrary to Spence's findings, 
both early and late recallers reproduce more 
associates than control words in all four studies.
Statistical analyses revealed that with one excep- 
tion the recall of associates was significantly 
superior at better than the .05 level (two-tailed 
Wilcoxon tests). The sole exception was for the 
early recallers in the experiment (Study 4) 
where the critical stimulus was SHIRT and the
CHEESE associates were presented. Furthermore, 
it may be noted that, again in contrast to 
Spence's findings, three studies (1, 2, and 3) 
showed the early recallers to have a higher recall 
of associates than late recallers. However, none 
of the comparisons of the difference scores (i.e. 
associates minus control words) in any of the 
four studies were significant by either Fisher’s 
exact probability test or the median test depending
upon the size of N within cells. In sum then,
the findings with a single exception favor the 
significantly superior recall of associates over 
control words for both early and late recallers 
of the critical stimulus regardless of the content 
of that stimulus in relation to the associates.


Nonrecallers ("Forgetters") of the Critical Stimulus


Spence (1964) found that nonrecallers of the 
critical stimulus did not differ significantly in 
their recall of associates versus control words.
These findings are again puzzling since the associ- 
ative cluster hypothesis would predict that more 
associates would be reproduced when compared 
to control words. Table 2 contains the basic data 
for nonrecallers of the critical stimulus in each 
of the four studies. The table clearly reveals
that in each study the nonrecallers reproduced
more associates than control words. The recall
of associates is significantly superior at better 
than the .001 level in all four studies (two-tailed 
Wilcoxon tests). 

Discussion 


In four separate experiments using associates 
with related and nonrelated critical stimuli, the 
following findings were consistently obtained: 


TABLE 2

MEAN RECALL OF ASSOCIATES, CONTROL, BUFFER WORDS, AND INTRUSIONS FOR EARLY RECALLERS,
LATE RECALLERS, AND NONRECALLERS IN FOUR STUDIES

                            Study 1                            Study 2
                  ANGER with ANGER associates        CHEESE with ANGER associates
Words               Early      Late       Non          Early      Late       Non
                  (N = 22)   (N = 37)   (N = 74)     (N = 10)   (N = 20)   (N = 49)
Associates (10)     3.09       2.46       2.26         3.40       2.45       2.63
Control (10)        2.23       1.73       1.55         1.80       1.30       1.73
Buffer (6)          3.55       3.24       3.36         3.30       4.00       4.00
Total (26)          8.87       7.43       7.17         8.50       7.75       8.36
Intrusions          1.64       1.68       1.49         1.30       1.15       1.24

                            Study 3                            Study 4
                  ANGER with CHEESE associates       SHIRT with CHEESE associates
Words               Early      Late       Non          Early      Late       Non
                  [illeg.]   (N = 19)   [illeg.]     (N = 6)    [illeg.]   [illeg.]
Associates (10)     4.40       3.26     [illeg.]       2.83       3.35       3.11
Control (10)        2.00       2.11     [illeg.]       2.67       2.15       2.13
Buffer (6)          4.00       3.68     [illeg.]       4.00       3.45       3.65
Total (26)         11.40       9.05     [illeg.]       9.50       8.95       8.89
Intrusions          1.40       0.89     [illeg.]       1.17       1.70       1.56


1. Before a critical stimulus was recalled, significantly
more associates than control words
were recalled. This occurred whether or not the
critical stimulus was related to the associates.

2. After the critical stimulus was recalled, as- 
sociates showed a consistently greater, albeit 
insignificant trend, to be recalled in comparison 
to control words. 

3. Both early and late recallers of the critical 
stimulus recalled more associates than control
words and the difference between these types of 
recallers was not significant. 

4. Finally, nonrecallers of the critical stimulus
recalled significantly more associates than control
words. These findings are consistent with an 
associative cluster hypothesis and are not in 
agreement with the trivariable formulation con- 
sisting of conscious, preconscious, and appar- 
ently unconscious proposed by Spence. Clearly, 
the associative cluster hypothesis is much more 
parsimonious (though not necessarily simpler— 
see below) and is able to account for more of
the data than the former. The four presented 
studies have provided a number of findings that 
are at variance with those that Spence obtained 
but which were predicted by the associative 
cluster hypothesis—namely, those in relation to 
early and late recallers and those for the nonrecallers
or “forgetters” of the critical stimulus.


It is worth noting further that several of our
results are consistent with those obtained by 
Deese (1959). For example, Deese found that 
high frequency associates to a stimulus were
superior in recall to medium frequency associates.
This finding parallels the superiority we
obtained for our associates over the unrelated 
control words. Further, Deese found that there 
were no differences in recall when either a related
or nonrelated stimulus to associates was at
the head of a list. This is indirectly pertinent
to our finding that neither the content of a critical
stimulus nor its recall was important in the
recall of associates. Rather the significant de- 
terminer of recall was the presence of associates 
versus nonassociate control words which was 
subsumed under the associative cluster hy- 
pothesis. 

Although we have generally described the 
associative cluster position, this position is 
neither novel nor simple. A similar explanatory 
approach has been taken by Bousfield (1953) 
who suggests that the individual imposes a super- 
ordinate construction on items that are clustered. 
In contrast, Deese (1959) proposes that clusters 
are formed on what appear to be free associations
among items. The present research has been
oriented toward demonstrating the utility of a
general associative cluster hypothesis and not
toward a resolution between different approaches
attempting to account for the formation of these 
clusters. 

Finally, despite our consistent demonstration 
of the predictive superiority of the associative 
cluster hypothesis, does this mean that an ap- 
proach using the trivariable scheme of “con- 
scious, preconscious, and unconscious” is of no 
value? The problems generated by such a tri- 
variable view have been broached in the many 
studies generally concerned with the role of 
“awareness” in performance. A persistent dif- 
ficulty here has been in an adequate operational 
specification of the term. In the researches presented
here, it may be noted that the operations
used to mirror these three “consciousness” vari- 
ables by Spence are precisely the same as those 
that have shown the utility of the simpler notion 
of associative clustering. This would mean that 
although under some experimental circumstances 
there may be some value to this three-variable 
scheme, the present circumstances are not ade- 
quate for this purpose. To paraphrase Lloyd 
Morgan’s canon, let us be wary of giving more 
complex solutions to problems when simpler 
ones suffice. 
RESTRICTING EFFECTS OF AWARENESS?:
SERIAL POSITION BIAS IN SPENCE'S STUDY 


JOHN JUNG 


York University, Toronto 


Group I was a replication of the Spence (1964) study which suggested that 
awareness restricts recall. Spence's list was used with his serial order; it contained
27 words, CHEESE and 10 associates, 10 nonassociates of CHEESE, and
6 buffer items. Both studies suggested more recall of associates before but not 
after recall of CHEESE; however, there was not complete agreement between 
the studies. Groups II and III received the Spence list with markedly different 
serial orders to test the hypothesis that Spence’s results are partly due to the 
serial order used. The list order for Group II contained associates in positions 
more favorable for recall whereas the order for Group III favored the nonassociates.
The results for Group II showed more associates recalled but those
for Group III showed no difference between word classes in recall. It was 
concluded that serial position and the greater interitem associative strength
(Deese, 1962) of the associates in Spence's serial order can account for the 


results which he attributes to restricting effects of awareness on recall. 


Spence (1964) presented evidence suggesting 
that awareness has restricting effects on recall. 
Free recall was measured on a list of 27 words 
consisting of 6 neutral buffer items, 10 words 
known to be associated in varying degrees to the 
word CHEESE, and 10 words matched in frequency
to the set of associates but known to be
nonassociates themselves, and the word CHEESE. 
The exact list order used by Spence was the 
same for all subjects and is presented later since 
the effects of various serial orders of the Spence 
list on recall will be reported here. 

Spence found more associates were recalled 
before recall of CHEESE; however, after recall of 
CHEESE, no difference between number of associ- 
ates and nonassociates was obtained. Forgetters 
of CHEESE also showed no differential recall of 
the two word classes. It was concluded that 
awareness has restricting effects on recall. Before 
CHEESE is recalled, this key word has not entered 
consciousness and it is able to influence the 
recall of its associates unimpeded. However, 
after CHEESE is recalled and the subject becomes 
aware of its presence, associative diffusion becomes
restricted by the active search for associates
by the subject. As a result, recall of associates
fails to exceed that of nonassociates after
CHEESE is recalled. In the case of forgetters,
Spence asserts that CHEESE has a different effect
on recall than in subjects who eventually recall 
it. Where CHEESE is forgotten, Spence maintains, 
either it failed to register initially or some dy- 
namic defensive processes prevent it from enter- 
ing awareness; thus, forgetters also show no dif- 
ference in recall of the two classes of words. 


Additional analyses of the temporal point of
recall of CHEESE were presented by Spence as
support. If the subject recalls CHEESE early, no
difference in recall of the two word classes will
result due to the longer conscious influence of
CHEESE; but if CHEESE occurs late in the subject's
recall it has had more time for unconscious
influence and thus more associates than nonassociates
should be recalled. One might argue,
on this basis, that forgetters who are "infinitely
late" recallers of CHEESE should also recall more
associates; however, Spence does not entertain
this implication. He did report that late recallers
of CHEESE gave more associates than nonassoci- 
ates; on the other hand, early recallers (and 
forgetters) showed no difference in the number 
of words recalled of each type. 

Although Spence replicated this study success- 
fully, only one serial order of his word list was 
employed. It is a well-established fact that free 
recall is a function of serial position (Deese & 
Kaufman, 1957) and Spence's failure to vary his 
serial order may have influenced his results. 
Close inspection of the Spence serial order (see 
Method) shows that 6 of the 10 outermost items 
(not including buffers which, incidentally, could 
be counted as nonassociates) in the list are 
associates; on the other hand, only 4 of the 10 
innermost items (not including CHEESE) are
associates. 

Deese and Kaufman demonstrated with lists 
of equal length to that used by Spence that 
recall is highest for items at the end of the list, 
followed by those at the beginning, with poorest
recall for items in the middle. On the basis of
this effect, it should be apparent that in Spence's
serial order, associates occupy 6 of 10 of the
more favorable positions (not including buffers).
This factor provides some bias in favor of higher
recall of associates.

However, Spence found under some conditions 
that there were either no differences or superior
recall of the nonassociates. For example, after
recall of CHEESE there was no difference between 
number of associates and nonassociates recalled. 
In addition, early recallers failed to show su- 
perior recall of associates. 

Both exceptions to the prediction of higher 
recall of associates due to a bias in Spence's 
serial order can be handled by one additional 
fact of recall. Deese and Kaufman not only 
found probability of recall to be related to 
serial position in the manner described earlier 
but they also reported that the temporal order 
of recall was similarly related to serial position. 
Items at the end of the list were recalled first, 
those at the beginning next, and finally, those in 
the middle. 

This effect implies that a typical subject 
would tend to recall CHEESE relatively late (if
at all) since CHEESE was the fifteenth of 27
words in Spence's list order. Before the recall of
CHEESE he should be recalling items at the ends
of the list of which 60% are associates; after
the recall of CHEESE, he should be recalling items
from the center of the list where only 40% of
the items are associates. In addition, probability
of recall is higher for the ends than for the 
middle of the list. 

The failure of higher recall of associates after 
recall of CHEESE can be attributed to the fact 
that only 40% of the words in the middle of 
the list are associates rather than to the assumption
that awareness restricts recall. Similarly,
higher recall of associates before CHEESE is recalled
may be due to the fact that 60% of the
end items are associates, not to the greater
associative diffusion when the subject is unaware
of CHEESE.

Why do early recallers, then, fail to show 
more associates recalled whereas late recallers 
do? First, it should be pointed out that early 
recallers of middle items of the list such as 
CHEESE are not typical of subjects in general
according to Deese and Kaufman’s results. Since
CHEESE is in the center of the list, it should be
recalled relatively late by the typical subject, if
at all. Early recallers, for whatever reasons they
start at the center of the list, are in the region
containing only 40% of the associates. This
factor rather than restrictive effects of awareness
can account for failure of early recallers to give
more associates. 

One flaw in our argument may be apparent: 
why does not recall of nonassociates ever exceed 
that of associates? Especially in the two instances 
cited above of recall after CHEESE is recalled 
and of early recallers, it could be maintained that 
since only 40% of the items in the middle of 
the list are associates, recall in those two cases
which primarily involve the middle of the list 
should show more nonassociates recalled. The 
failure of nonassociates to produce higher recall 
even when more frequent may be due to the 
fact that the 10 associates interact on the recall 
of each other. Although there are 10 associates 
and 10 nonassociates in the list, one may argue 
that there is one class of 10 associates and 10 
classes of nonassociates each with one item. The 
actual presence of the key word CHEESE in the 
list may be superfluous to the ability of the 10 
associates to influence the recall of each other. 
Spence maintained that since the interitem 
associative strength of the set of 10 associates 
to each other was virtually 0, the 10 associates 
could not influence the recall of each other. 
However, Spence's measure of interitem asso- 
ciative strength was based on the tendency 
of the 10 associates to elicit one another as
associates on a word-association test. Although 
Deese (1959) devised this method, he later 
(Deese, 1962) considered it to be inferior to a 
consideration of the degree of overlap of associa- 
tions to each of the words of a set. For example, 
if neither PIANO nor SYMPHONY elicit each other 
on a word-association test this does not mean 
they are not interassociated since both may elicit 
a common associate such as MUSIC. 
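
The contrast between the two measures can be made concrete with a small sketch. It is offered only as an illustration: the association responses below are hypothetical, and the overlap ratio used (shared responses divided by all distinct responses) is an assumed stand-in, not the index Deese (1962) actually computed.

```python
# Hypothetical word-association responses (illustrative only, not published norms).
RESPONSES = {
    "PIANO": {"MUSIC", "KEYS", "PLAY"},
    "SYMPHONY": {"MUSIC", "ORCHESTRA", "CONCERT"},
    "CHAIR": {"TABLE", "SIT"},
}


def direct_link(a, b, table):
    """True if either word elicits the other (the direct-elicitation criterion)."""
    return b in table[a] or a in table[b]


def overlap(a, b, table):
    """Share of responses common to both words.

    ASSUMPTION: a simple ratio of shared to distinct responses is used here
    as a stand-in for an overlap-based measure.
    """
    common = table[a] & table[b]
    union = table[a] | table[b]
    return len(common) / len(union) if union else 0.0


if __name__ == "__main__":
    print(direct_link("PIANO", "SYMPHONY", RESPONSES))  # neither elicits the other
    print(overlap("PIANO", "SYMPHONY", RESPONSES))      # yet they share MUSIC
```

On these invented responses the direct criterion treats PIANO and SYMPHONY as unrelated, while the overlap ratio is above zero because both elicit MUSIC.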

The present study varies the serial order of 
the list used by Spence to test the generality of 
his finding that awareness restricts recall. 


METHOD 
Subjects 


Students in three introductory psychology classes 
provided a total of 95 subjects, 30 for Group I, 35 
for Group II, and 30 for Group III. 


Experimenters 


The three instructors of the three classes, none of 
whom knew the purpose of the study, served as 
experimenters for their respective classes. 


Design 

Group I received the list in the same serial order 
used by Spence in order to replicate the Spence find- 
ings. His order was: street, bird, dime, flag, cow, 
arch, BREAD, CAVE, chair, MOON, best, COTTAGE, gift, 
frail, CHEESE, GREEN, sand, BRICK, hedge, SMELL, 
MOUSE, mirror, sour, trunk, chin, number, joke
(CHEESE associates are in capital letters). Frequency 
of associates and nonassociates were matched. 
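
The positional bias claimed for this order can be checked by counting. The sketch below is an illustration, not part of the original analysis, and rests on two assumptions: that the six buffers are the first three and last three words, and that cow and sour are the two associates beyond the eight shown in capitals above (ten are stated to be in the list).

```python
# Spence's serial order as listed above, lower-cased for matching.
ORDER = [
    "street", "bird", "dime", "flag", "cow", "arch", "bread", "cave", "chair",
    "moon", "best", "cottage", "gift", "frail", "cheese", "green", "sand",
    "brick", "hedge", "smell", "mouse", "mirror", "sour", "trunk", "chin",
    "number", "joke",
]

# Eight associates are capitalized in the list as given.
# ASSUMPTION: "cow" and "sour" are taken to be the remaining two.
ASSOCIATES = {"bread", "cave", "moon", "cottage", "green", "brick",
              "smell", "mouse", "cow", "sour"}


def associate_counts(order, associates, key="cheese", n_buffer=3, n_outer=5):
    """Count associates among the outermost and innermost non-buffer items.

    ASSUMPTION: n_buffer buffer words sit at each end of the list.
    Returns (outer_associates, outer_size, inner_associates, inner_size).
    """
    core = order[n_buffer:len(order) - n_buffer]       # list proper, buffers removed
    outer = core[:n_outer] + core[-n_outer:]           # first and last five items
    inner = [w for w in core[n_outer:-n_outer] if w != key]
    count = lambda words: sum(w in associates for w in words)
    return count(outer), len(outer), count(inner), len(inner)


if __name__ == "__main__":
    print(ORDER.index("cheese") + 1)           # serial position of the key word
    print(associate_counts(ORDER, ASSOCIATES))
```

Under these assumptions the count reproduces the figures given in the introduction: 6 of the 10 outermost items and 4 of the 10 innermost items are associates, with CHEESE in the fifteenth position.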

The serial order for Group II was such that the
10 associates were placed in the first and last 5 posi- 
tions within the buffer items. The 10 nonassociates 
with CHEESE in the center occupied the middle 11 
serial positions. It is assumed that if serial position 
influences recall as described earlier, this list order 
should enhance the superior recall of associates 
found by Spence. 

In Group III, the list was arranged such that the 
first and last 5 positions within the buffer items 
were occupied by the nonassociates and the middle 
11 positions by the 10 associates plus CHEESE. Under 
this order, no superior recall of associates is pre- 
dicted on the basis of the serial position hypothesis. 
In fact, one might expect nonassociates which oc- 
cupy the more favorable positions for recall to show 
higher recall than the associates. However, it should 
be noted that the 10 associates and CHEESE may not 
suffer in recall as much as one might expect merely 
on the basis of their serial position. Being placed 
together in a block may enhance the action of inter- 
item associative tendencies for the 10 associates and 
CHEESE (interitem association strength being based 
on Deese, 1962, rather than Deese, 1959, which 
Spence, 1964, used). 


Procedure 


Instructions were based on those described by 
Spence (1964). Each list order was administered to 
a different class which was tested as a group. They 
were informed that a list of 27 words would be read 
to them at a rate of 1 per second. They were in- 
structed to listen carefully since they would be 
tested for recall afterwards. After the list was read, 
they were asked to write down as many of the 
words as they could recall in any order (Spence's, 
and ours, instructions did not allow the subject to 
know recall could be in any order until after the 
list was read). Guessing was encouraged and no 
time limit was imposed. 


RESULTS AND DISCUSSION 
Total Recall 


A comparison was made of the mean number 
of associates and nonassociates in the total recall 
for each condition. Recall of associates was 
higher than that of nonassociates for all condi- 
tions, the mean differences being 0.9 for Group 
I, 1.3 for Group II, and 0.3 for Group III. Using 
Wilcoxon tests, these differences within each 
condition were significant in Groups I and II
(p < .01 in each case) but not in Group III.
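
For readers unfamiliar with the within-group comparison, the following sketch shows how the signed-rank sums of a Wilcoxon matched-pairs test are formed. The scores are hypothetical and are not data from this study; the function name is likewise illustrative. Zero differences are dropped and tied absolute differences share the mean of their ranks, which is the usual convention.

```python
def signed_rank_sums(pairs):
    """Return (W_plus, W_minus) for paired scores.

    Zero differences are dropped; tied absolute differences receive the
    average of the ranks they span.
    """
    diffs = [a - b for a, b in pairs if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of equal absolute differences.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        mean_rank = (i + j) / 2 + 1          # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = mean_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical (associates, nonassociates) recall counts for eight subjects.
SCORES = [(4, 2), (3, 3), (5, 2), (2, 3), (4, 1), (3, 2), (6, 3), (2, 2)]

if __name__ == "__main__":
    print(signed_rank_sums(SCORES))
```

The smaller of the two sums is the statistic referred to tables of critical values; a small sum for the negative differences indicates that associates were recalled more often than nonassociates within subjects.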

Next, Mann-Whitney tests were made of the 
differences among the three groups on the types 
of words recalled. The superior recall of asso- 
ciates found in both Groups I and II was sig- 
nificantly greater than that found in Group III 
(p < .02 between Groups I and III; p = .01 between
Groups II and III). No significant differences
were obtained between Groups I and II.

Thus, list serial order affects the relative re- 
call of associates and nonassociates in Spence's 
list. In Group I which used Spence's serial order, 
significantly more associates were recalled; how- 
ever, that order was favorable toward recall of 
associates. The list order in Group II was even 
more favorable for recall of associates since 
those items occupied the 10 outermost positions 
of the list proper; the results showed significantly
more associates recalled with this serial
order. Finally, in Group III which received a 
serial order in which the 10 more favorable posi- 
tions for recall were assigned to nonassociates, 
there was no difference in recall between associ- 
ates and nonassociates. 

The present results support the serial position 
hypothesis as an alternative explanation for the 
results obtained by Spence. Bias in favor of associates
being recalled appears plausible as an
explanation for the results obtained with the 
Spence serial order. However, the fact that the 
nonassociates, although occupying the more fa- 
vorable positions for recall in the serial order 
for Group III, failed to be recalled better than 
the associates suggests that the serial position 
hypothesis is incomplete. An explanation offered 
earlier that the interitem associative strength of 
the associates exceeds that of the nonassociates 
may account for the equal recall of the two 
classes of words in Group III. Although non- 
associates are favored by serial position, recall 
of associates is aided by greater interitem asso- 
ciative strength, especially since they are sand- 
wiched together with CHEESE. In summary, 
Spence's finding of better recall of associates may 
be due to the more favorable serial positions of 
these items in Spence's order and to their 
greater interitem associative strength.


Recall before and after CHEESE


Spence's main analyses were not based on total 
recall as above. One of his comparisons was 
based on recall before and after recall of CHEESE. 
His data and that of the present study are pre- 
sented for comparison in Table 1 showing the 
mean number of correct recalls for each class of 
words. 
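
The before-and-after scoring can be sketched as follows. This is an illustrative reading of the procedure, not the scoring routine actually used; the function name and the sample protocol are hypothetical, though the words are drawn from the list given under Design.

```python
def split_counts(recall, key, associates, nonassociates):
    """Count associates and nonassociates written before and after the key word.

    Returns None when the key word is absent (a forgetter); otherwise a dict
    mapping "before" and "after" to (n_associates, n_nonassociates).
    Buffer words and intrusions fall into neither count.
    """
    if key not in recall:
        return None
    pos = recall.index(key)
    before, after = recall[:pos], recall[pos + 1:]

    def tally(words):
        return (sum(w in associates for w in words),
                sum(w in nonassociates for w in words))

    return {"before": tally(before), "after": tally(after)}


# Hypothetical written protocol, in the order the subject produced it.
PROTOCOL = ["joke", "mouse", "smell", "hedge", "cheese", "sand", "bread"]
ASSOC = {"mouse", "smell", "bread"}
NONASSOC = {"hedge", "sand", "gift"}

if __name__ == "__main__":
    print(split_counts(PROTOCOL, "cheese", ASSOC, NONASSOC))
```

For this invented protocol two associates and one nonassociate precede CHEESE, and one of each follows it; the buffer word joke is ignored.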

A comparison of Spence's two groups with 
Group I of the present study shows that before 
CHEESE is recalled more associates are recalled 
but after CHEESE is recalled little or no difference
in recall of the two word classes is found.
However, although the direction of the differences
agrees, the higher recall of associates before
CHEESE was not significant in our study. 
TABLE 1
MEAN RECALL OF THREE CLASSES OF WORDS BEFORE AND AFTER RECALL OF CHEESE

                        Spence                                  Jung
               Group I        Group II        Group I        Group II       Group III
Class       Before  After   Before  After   Before  After   Before  After   Before  After
Associates    2.4    2.5      2.5    1.8      1.9    1.4      1.5    1.9      2.0    2.0
Control       1.2    2.4      1.6    2.1      1.0    1.4      0.9    1.2      2.2    1.5
Buffer        2.4    1.7      2.0    1.0      1.4    1.2      2.0    0.9      2.6    1.0


In Group II which received an order with 
associates in favorable positions for recall, 
more associates than nonassociates were recalled 
both before and after recall of CHEESE; however,
in neither case was the Wilcoxon test
significant.


Finally, in Group III which received an order 
in which nonassociates occupied positions more 
favorable for recall, slightly more nonassociates 
were recalled before CHEESE and more associates
were recalled after CHEESE. However, neither
difference was significant by a Wilcoxon test. 

Although none of the above comparisons
yielded significant differences, it should be noted
that the before and after CHEESE comparisons in
Group I which received the Spence list order 
were very similar to Spence's results. On the 
other hand, these comparisons in Groups II and 
III receiving markedly different list orders produced
quite different results which appear consistent
with the nature of those different list
orders. 


Type of Recaller 


Spence compared recall in early and late recallers
and forgetters of CHEESE. Recallers of
CHEESE who included it in the first half of their 
total recall were considered as early and those 
who included it in the last half of their total 
recall were labeled as late. Table 2 presents the 
recall of the different word classes as a func- 
tion of type of subject for the three groups of 
the present study as well as that of Spence's two 
groups. 
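
The classification rule just described can be stated compactly. The sketch below is illustrative only; the function name is hypothetical, and the handling of a key word that falls exactly at the midpoint of an odd-length protocol (counted here as late) is an assumption, since the text does not specify it.

```python
def recaller_type(recall, key="cheese"):
    """Classify a subject by where the key word falls in the written recall.

    "forgetter": key word absent from the protocol.
    "early":     key word within the first half of the words written.
    "late":      key word within the second half.
    ASSUMPTION: with an odd number of words, the exact middle counts as late.
    """
    if key not in recall:
        return "forgetter"
    position = recall.index(key) + 1          # 1-based output position
    return "early" if position <= len(recall) / 2 else "late"


if __name__ == "__main__":
    print(recaller_type(["mouse", "cheese", "sand", "bread"]))
    print(recaller_type(["mouse", "sand", "bread", "cheese"]))
    print(recaller_type(["mouse", "sand"]))
```

Because the rule is relative to each subject's own total output, a subject who writes few words can be classed as late even when CHEESE appears near the start of the session.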

TABLE 2
MEAN RECALL OF CHEESE ASSOCIATES, NONASSOCIATES, BUFFERS, AND INTRUSIONS

                                            Spence
                            Group I (N = 34)             Group II (N = 36)
Words                     Early  Late  Forgetters      Early  Late  Forgetters
Associates (N = 10)        4.2   5.4      3.6           3.2   5.0      3.2
Nonassociates (N = 10)     4.4   2.7      2.9           4.2   3.4      2.9
Buffer (N = 6)             3.9   3.7      3.3           3.6   2.7      2.6
Intrusions                 0.9   1.4      2.3           1.5   1.6      2.2

                                             Jung
                            Group I (N = 30)             Group II (N = 35)            Group III (N = 30)
Words                     Early  Late  Forgetters      Early  Late  Forgetters      Early  Late  Forgetters
Associates (N = 10)        3.1   3.9      3.5           3.9   2.9      4.3           4.4   3.5      3.3
Nonassociates (N = 10)     2.3   2.5      1.8           2.4   2.0      2.3           4.0   3.4      3.1
Buffer (N = 6)             2.9   2.1      2.3           3.0   3.0      2.4           3.3   3.8      2.9
Intrusions                 5.0   5.7      4.4           1.5   1.4      2.8           1.8   1.3      3.4

[Subgroup Ns illegible.]


Spence found late subjects gave more associates
but that early subjects and forgetters showed
no difference in recall of associates and nonassociates.
The present study, however, finds
that all types of subjects in all conditions tend
to recall more associates than nonassociates. 
Regardless of whether the subject recalls CHEESE 
early or late or not at all, he tends to give more 
associates in recall. This superior recall of as- 
sociates occurs regardless of the list order also. 

Differences in favor of associates were sub- 
stantially higher in Groups I and II which had 
list orders favoring associates than in Group III 
which had a list order favoring recall of non- 
associates. However, all except two of the com- 
parisons failed to produce a significant Wilcoxon 
test. Somewhat surprisingly perhaps, forgetters of 
CHEESE in both Groups I and II which had list 
orders favoring associates in recall were the only 
classes of subjects who gave significantly more 
associates in recall, p < .01 in both cases. Spence, 
it will be noticed, found that forgetters showed 
no differences in recall; yet, in the three conditions
of the present study as well as the two of
Spence there is numerically higher recall of as- 
sociates by forgetters. 

In summary, analyses of the data in several 
different ways consistently show higher recall of 
associates with few exceptions. This superiority 
of recall for associates tends to be greatest for 
Group II which received the list in the order 
most favorable for recall of associates; smallest 
differences in favor of recall of associates are
found in Group III which received the list in an 
order with nonassociates in positions more fa- 
vorable for recall. When total recall is consid- 
ered, the higher recall of associates is significant 
in Groups I and II but not in III; in the analyses 
of recall before and after CHEESE and as a function
of type of recaller, most of the differences
were not significant but consistent with our explanation
of Spence's results based on serial position
effects and interitem associative strength
rather than Spence's notion of restricting effects
of awareness.
AN UNSUCCESSFUL ATTEMPT TO REPLICATE SPENCE’S
EXPERIMENT ON THE RESTRICTING EFFECTS 
OF AWARENESS 


IRIS BRUEL, STANLEY GINSBERG, MARY LUKOMNIK, 
AND GERTRUDE R. SCHMEIDLER


City College of New York 


An E unfamiliar with the purpose of the experiment read to 60 Ss a word list 
including CHEESE, associates to CHEESE, and nonassociates. Results showed 
fewer Ss who recalled CHEESE than Spence had found (p = .001) and only a
slight, nonsignificant trend in the direction of Spence’s other findings. It was 
then learned that Spence’s E had known the purpose of the experiment. A 
supplementary series with one of us as E gave a proportion of CHEESE recallers 
significantly different from ours but not from Spence’s. Conclusions are: (a) 
the naivete of E was a meaningful variable; (b) the question is still open as
to whether subliminal stimuli are functionally equivalent to words not re- 


called but accessible to awareness. 


Spence and Holland (1962) and Spence (1964) 
attempted to validate the Freudian concept that 
preconscious thinking follows associative pathways.
They hypothesized that a preconscious
stimulus should arouse more associations than a
conscious one, and that the patterns of recall
produced by preconscious and conscious stimuli
should be different.

Spence and Holland flashed the word CHEESE
subliminally. An experimenter naive as to the 
purpose of the experiment then read aloud to 
the subjects a list of words which included as- 
sociates to CHEESE and nonassociative control 
words. Subjects were asked to write the words
they could recall. They recalled more associates 
to CHEESE than did subjects to whom CHEESE
was presented supraliminally, or than subjects to
whom the word CHEESE was not presented at all.

In a later experiment, Spence included the
critical word CHEESE midway in a list of orally 
presented associates and control (unrelated) 
words and asked his subjects to write as many of 
the words as they could recall. Words recalled 
before CHEESE included a significantly greater
number of associates than control words. After
CHEESE was recalled, as many control words
were recalled as were associates. When CHEESE
was forgotten, the subjects recalled slightly but
not significantly more associates than control 
words, but they recalled fewer associates than 
the group of CHEESE recallers. 

Because of the similarity of the results of 
these two experiments, Spence (1964) suggests 
that “a word not actually in awareness but accessible 
to recall is, in many ways, equivalent to 
a subliminal stimulus [p. 98].” He further sug- 
gests that subjects who forget CHEESE may do so 
because of its unpleasant connotation, basing 
this suggestion on the fact that a significant 
proportion of forgetters recalled the word SMELL. 

We attempted to replicate these provocative 
results. 


METHOD 


Sixty students in an introductory psychology class 
were seated in alternate rows and seats of an audi- 
torium. They were told upon entering the room, 
“This is not a test,” and casual, spontaneous re- 
marks were added to maintain an informal atmo- 
sphere until the instructions were presented. 

All subjects were asked to mark three sheets of 
paper with any code symbol that they chose for 
identification. A naive experimenter, who had been 
trained to read short words at the rate of 1 per 
second, was given Spence's word list, with all words 
typewritten in capitals. She read the words without 
any phrasing that was noticeable to observers. When 
she had completed the list, subjects wrote on the 
first of their sheets of paper all the words that they 
could recall. When they had completed their recall 
they put their lists in containers placed in the aisles. 
The experimenter then said: 


In the list of words read to you there was one 
word which ties in with many words on the list. 
Some of the words were all associated with some- 
thing familiar to you. Can you guess which word 
it was that had the others associated with it? 
TABLE 1 


AVERAGE RECALL OF THREE CLASSES OF WORDS 
BEFORE AND AFTER RECALL OF CHEESE 


Words                            Before    After 

Associates                        1.05     1.89 
Controls                           .68     1.63 
Difference between associates 
  and controls                     .37      .26 
Buffers                           3.00      .89 


Write down your guess on the next sheet of paper. 
If you have another guess, put it in parentheses. 


These sheets were then put in the containers. Sub- 
jects were told that the critical word was CHEESE, 
and were asked to write on their third sheet "whether 
or not, you like cheese." These papers were then put 
into the containers. 


RESULTS 


Nineteen of the 60 subjects recalled the word 
CHEESE. This is a significantly smaller propor- 
tion than the 48 out of 70 who recalled CHEESE 
in Spence's experiment (χ² = 17.60, p < .001). 
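
The comparison of the two proportions can be checked with a few lines of arithmetic. The following is only a minimal sketch: it assumes that the reported statistic is a Pearson chi-square on the 2 x 2 table of recallers and forgetters, computed without a continuity correction, and the function name `chi_square_2x2` is hypothetical. Under those assumptions the result is approximately 17.6, close to the value reported above.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: present series (19 of 60 recalled CHEESE), Spence's series (48 of 70).
# Columns: recallers, forgetters.
chi2 = chi_square_2x2(19, 60 - 19, 48, 70 - 48)
print(round(chi2, 2))
```

With 1 degree of freedom, a value of this size lies well beyond the conventional .001 critical value of 10.83.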

Spence found in two groups of subjects aver- 
ages of .96 and .90 more associates recalled than 
control words. For the combined results, the 
difference between associates and controls was 
significant at the .001 level (two-tailed Wilcoxon 
test). Our results (Table 1) are that an average 
of 0.37 more associates were recalled than con- 
trol words. The difference as evaluated by Wil- 
coxon's T is 26.0, which is clearly insignificant. 

Spence also subtracted control words from 
associates. The difference was greater before 
CHEESE was recalled than after (.8 in one group 
and 1.2 in the other). Our results (Table 1) 
show an average difference of 0.11 in the same 
direction, which of course is not significant (Wil- 
coxon’s T = 49.5). 

Spence found the average difference for associates 
minus control words (2.6) to be significantly 
greater for late recallers (those who recalled 
CHEESE in the latter half of their word 
lists) than for early recallers (those who recalled 
CHEESE in the first half or at the middle of their 
lists). Our results show an average difference of 
0.4 in favor of the late recallers (Table 2), which 
is not significant (Mann-Whitney U test, p = .36). 

Spence suggested that subjects forget CHEESE 
because of its unpleasant connotations related to 
smell, as evidenced by the large proportion of 
forgetters who recalled SMELL. Of our 41 CHEESE 
forgetters, only 7 recalled SMELL; and of these 
7, 5 wrote that they liked cheese, 1 wrote that 
he did not, and 1 gave an equivocal answer. Al- 
though these figures do not seem consistent with 
TABLE 2 


AVERAGE RECALL OF CHEESE ASSOCIATES, CONTROL, 
BUFFER WORDS, AND INTRUSIONS 


                     Early recall   Late recall   Forgetters 
Words                  (N = 11)       (N = 8)      (N = 41) 

Associates (10)          3.36 
Control (10)             2.91 
Buffer (6)               3.82 
Total (26)              10.09 
Intrusions               2.36 


Spence's suggestion, we found others which tend 
to support it. Of the subjects who reported on 
their final sheet of paper that they liked cheese, 
37.5% recalled CHEESE, but only 17.6% of those 
who reported that they disliked cheese recalled 
the word. The difference between groups is 
marginally suggestive (p = .10, by Fisher's exact 
method). 

Only 1 of the 60 subjects guessed that CHEESE 
was the tie-in word with which many others 
were associated. 


DISCUSSION AND SUPPLEMENTARY SERIES 


Our results tended to follow in the direction 
of Spence's, but failed to attain significance. 
After examining them we discussed them with 
Spence, who told us that he had been the one 
to read the word list to his subjects in the 1964 
experiment. It seemed possible that our use of a 
naive experimenter may account for the fact that 
fewer of our subjects recalled the word CHEESE, 
and to check on this possibility we performed a 
supplementary experiment in which one of us 
(IB) read the word list to four classes in intro- 
ductory psychology. 

The procedure was like that of our previous 
experiment except that instead of insisting that 
students sit in alternate seats and rows, the 
classes were asked to arrange their chairs so that 
no one would inadvertently have his eye fall on 
what anyone else had written; for the first and 
fourth class, IB attempted to emphasize the 
word CHEESE by increasing her voice volume 
slightly and looking directly at the class as she 
read it, and also by counting an extra second 
both before and after the word; and for the sec- 
ond and third classes IB tried to deemphasize 
the word CHEESE by speaking it (and also the 
CHEESE associates on the list) a little less clearly 
and with a little less volume than the other 
words and also by not looking at the class while 
she read those words. 




For the two classes where CHEESE was deliberately 
emphasized, 26 out of 43 subjects recalled 
CHEESE; for the two classes where CHEESE 
was deliberately deemphasized, 21 out of 43 subjects 
recalled CHEESE. The difference is not significant. 
When the results are compared with 
the proportion of CHEESE recallers in our initial 
series, the difference is marginally significant for 
the deemphasis subjects (χ² = 3.89, p = .05) and 
significant for the emphasis subjects (χ² = 8.44, 
p = .01). Comparison with the proportion of 
CHEESE recallers in Spence's experiment shows a 
difference which does not attain significance for 
either the deemphasis subjects (χ² = 3.31, p = 
.10) or the emphasis subjects (χ² = .77). 
The data thus imply that even an experimenter 
who is deliberately trying not to emphasize a 
word known to be crucial may somehow make 
that word more memorable than would a naive 
experimenter. Perhaps the most interesting datum 
is the absence of a significant difference in the 
proportion of CHEESE recallers when CHEESE 
was emphasized and when it was deliberately 
deemphasized. 

The trend of the other data was not markedly 
different from that of our previous series, except 
that 56% of the subjects who said they liked 
cheese recalled the word, and 53% of the sub- 
jects who said they disliked cheese recalled the 
word. The difference is obviously insignificant. 

A theoretical issue is raised by consideration 
of our experimental findings in conjunction with 
those of Spence and Holland, where a naive ex- 
perimenter read the word list to the subjects. 
The major procedural difference between their 
experiment and ours seems to be that in theirs 
the stimulus word CHEESE was exposed subliminally. 
The difference between their results and 
ours implies a functional difference between a 
word presented subliminally and a word clearly 
heard. In comparing the 1962 and 1964 experi- 
ments, Spence concluded that there is functional 
equivalence between a subliminal stimulus and 
“a word not actually in awareness but accessible 
to recall.” We suggest that the question is still 
open. 
HOW RESTRICTED ARE THE RESTRICTING EFFECTS?: 


A REPLY 


DONALD P. SPENCE 
Research Center for Mental Health, New York University 


The restricting effects of awareness were shown 
in two ways in the original study (Spence, 1964). 
Early recall of the stimulus was found to inhibit 
recall of the associates because, it was argued, 
the conscious word brings into play the restricting 
structures of waking thought. Late recall of 
the stimulus appeared to enhance recall of associates 
because, it was argued, a preconscious word 
should have a widespread pattern of activation. 
The hypothesis clearly depends on the interaction 
between recall of the stimulus word and 
relative recall of the associates and can only 
be tested when the stimulus and associates are 
related. Therefore it would appear that only 
those studies in which the associates and stimulus 
word are related would be relevant. 

In the paper by Worell and Worell (1965), 
only Study 1 is pertinent to the original finding 
because only in that study is there any relation 
between the critical word and the associates. 
The authors show in Table 1 that relative recall 
of associates is greater before than after recall 
of the stimulus; although the effect is not significant, 
the trend supports the original study. 
They also show that relative recall is greater 
before the stimulus is recalled even when the 
stimulus is unrelated (Studies 2, 3, and 4) and 
offer an explanation based on associative clustering. 
They imply that the recall of the stimulus in 
some way signals the end of an active clustering 
process, thereby reducing the relative recall of 
associates; they do not say why such a signal 
should occur. It is clear that in Studies 2, 3, and 
4, the change in recall cannot be attributed to the 
restricting effects of awareness, because stimulus 
and associates are unrelated. Their proposed 
clustering principle may offer an explanation. It 
is not clear, however, how the clustering principle 
can explain the restricted recall of associates 
in the original study or Study 1, after the 
stimulus word was recalled. It therefore does 
not follow that the clustering principle, because 
it works partially in Studies 2, 3, and 4, is the 
preferred explanation in Study 1 where it explains 
only part of the findings. 

More compelling evidence against the restricting 
effects hypothesis is presented in Table 2, 
Study 1, where the data show that early recallers 
of ANGER show about the same relative recall as 
late recallers. Further study is needed to determine 
whether a threatening word, such as 
ANGER, introduces an important variable that 
must also be taken into account. 

The paper by Jung (1965) argues that the 
main finding in the original study may be a 
function of the serial position of the items; that 
because the associates tend to occupy outermost 
positions on the original list, they should be re- 
called early; and that CHEESE, because it came 
in the middle, should be recalled after the outer- 
most items (the argument is based on a study 
by Deese & Kaufman, 1957). Data in the orig- 
inal study cast doubt on the first two assump- 
tions (see Tables 3 and 4 in the original paper). 
Recall of associates at the outermost positions 
tended to be lower than recall of associates in 
the middle of the list—the opposite of the hy- 
pothesis. Secondly, CHEESE, a middle item, was 
not recalled late, on the average, but early; its 
modal serial position was 6. Turning to the new 
findings, there is again cause for doubt. Jung's 
position would imply that if serial position ac- 
counted for the results, a list that maximized 
primacy and recency would produce the strong- 
est effect, but this is not the case. The list used 
in Group II which had the 10 associates placed 
in the first 5 and last 5 positions within the 
buffer items showed less evidence of the restrict- 
ing effects of awareness than the original CHEESE 
list, for with Jung's list, relatively more asso- 
ciates were recalled both before and after 
CHEESE. 

Jung’s findings in Study 2 make it clear that 
recall of associates can be favored by clumping 
them at the ends of the list, but this change does 
not account for the interaction between recall of 
the stimulus word and recall of associates found 
in the original study. Once again, the alternative 
explanation does not confront the critical ques- 
tion, and to that extent, it does not seem pref- 
erable to the original formulation. 

The first study by Bruel, Ginsberg, Lukomnik, 
and Schmeidler (1965) showed a slight and non- 
significant tendency supporting the original find- 
ing. On the assumption that CHEESE was perhaps 
inadvertently emphasized in the original study, 
they ran a second experiment in which CHEESE 
was deliberately emphasized. The findings in 
this group (not reported here but provided 
privately by Schmeidler) made it clear that 
stimulus emphasis does not heighten the initial 
effect; on the contrary, it seems to work against 
it. When CHEESE was emphasized, the relative 
recall of associates was only slightly better before 
CHEESE was recalled, but significantly better 
after (p<.05, two-tailed test), reversing the 
original finding. It would seem that differences 
in emphasis do not account for the differences 
between the original study and the Bruel et al. 
replication; in fact, the new findings show that 
when the stimulus is clearly emphasized, the re- 
stricting effect disappears. This finding goes 
against the results reported by Deese (1959) 
and those found in the supraliminal condition in 
Spence and Holland (1962). It would appear 
that spontaneous awareness of the stimulus 
word, as it emerges in the course of recall, has 
different consequences than awareness which is 
partly the consequence of experimenter emphasis, 
but further study is needed to explore this 
distinction. 

Two of the papers make it clear that replica- 
tion of the original finding is not automatic: to 
that extent, the conditions under which the phe- 
nomenon will appear are still somewhat in doubt. 
The question of explanation is more clearly de- 
fined; none of the alternatives suggested is clearly 
satisfactory and it may be that one based on 
preconscious influences is the most parsimonious 
in light of current findings. 
ANTI-SEMITISM, STRESS, AND JUDGMENTS OF STRANGERS 


BRENDAN GAIL RULE 
University of Alberta 


This study investigated the judgments of high-, medium-, and low-prejudiced 
Ss under stress conditions. Extreme and moderate groups on the Levinson 
Anti-Semitism scale, subsequent to a difficult anagram task, were asked to 
describe 2 strangers. High- and low-prejudiced Ss reported greater personality 
differences between the strangers and were more negative in their evaluations 
than moderately prejudiced people. Extremely anti-Semitic Ss expressed more 
negative feelings toward the 1st person described; the low-prejudiced Ss were 
more negative toward the 2nd person described. 


Berkowitz (1959, 1960) has reported that 
prejudiced individuals manifest different judg- 
mental processes than their less prejudiced peers. 
His studies indicated that under stress highly 
prejudiced individuals use broader categorizations 
thereby making grosser discriminations between 
stimuli. Low-prejudiced individuals react to stress 
with a tendency to make finer discriminations 
between stimuli. On the other hand, Rokeach 
(1960) proposes that individuals who adhere to 
extreme points of view tend to espouse their 
views in the same way although the content of 
their attitudes may differ. Thus he argues that 
the structure of extremists’ conceptual systems 
is different from those who are more moderate 
in the expression of their views. Applying this 
reasoning to the dimension of prejudice, it might 
be expected that extreme scorers on the Anti-Semitism 
scale would manifest similar rather 
than different judgmental processes. The judgments 
of the extremely high- and low-prejudiced 
persons should differ from moderately prejudiced 
individuals' judgments. 

This study attempted to evaluate these two 
positions. Following a stressful situation, individuals 
classified as high, medium, and low anti-Semitic 
subjects were asked to describe two 
people, present in the experimental room and 
unknown to them prior to the experiment. 
According to Berkowitz' rationale, highly prejudiced 
individuals, using gross categories under 
stress, should report fewer personality differences 
between two strangers, whereas the low-prejudiced 
individuals, using finer categories 
under stress, should report greater personality 
differences between the strangers. Predicting from 
Rokeach's notion, both the high- and low-prejudiced 
individuals should use similar modes 
of categorizing and should not differ with respect 
to differentiating between people. The moderately 
prejudiced individuals should exhibit different 
judgmental processes from the other two 
groups. 

In addition to the study of judgmental proc- 
esses per se, the type of attitudes expressed 
toward strangers is of experimental interest. It 
has been found that highly prejudiced indi- 
viduals generalize negative feeling aroused by a 
frustrating person to a neutral bystander, 
whereas the low-prejudiced individuals become 
more friendly to the neutral person following 
frustration (Berkowitz, 1961). In this study, the 
frustration was independent of the two persons 
being described. Degree of negative feeling 
toward the two neutral strangers was compared 
for the high-, medium-, and low-prejudiced 
individuals. 


METHOD 
Subjects 



Introductory psychology students numbering 950 
completed a questionnaire which included the 17- 
item Anti-Semitism scale and the 28-item F Scale 
(Adorno, Frenkel-Brunswik, Levinson, & Sanford, 
1950). For both scales, item scores varied from 4 
points for a response indicating strong agreement 
with an item to 0 for a response indicating strong 
disagreement. The resulting distributions were: A-S 
scale, range = 0-29, median = 12; F Scale, range 
= 13-89, median = 50. From this sample 27 subjects 
were selected for the study. Experimental groups 
consisted of 9 subjects each of high-, medium-, and 
low-prejudiced individuals. The median scores for 
the groups were: (high) A-S = 22, F = 64; (medium) 
A-S = 13, F = 53; (low) A-S = 4, F = 39. 


Procedure 


Subjects were run in nine groups of three subjects 
each. The groups were homogeneous with respect to 
sex, anti-Semitism, and authoritarianism scores. Subjects 
were greeted and briefly introduced to each 
other. Each subject stated his or her name. No more 
communication was permitted. While the subjects 
were in view of each other, they were told that 
there were two parts to the study. The first task was 
to work on problems; the second was to provide 
information on first impressions of people. Then the 
subjects were seated in cubicles which restricted their 
range of observation. 

Subjects were given a list of anagrams to solve 
within 15 minutes. The anagrams corresponded to 
the problems used by Sarason (1961). Subjects were 
instructed that performance on the task was directly 
related to intelligence and that high-school students 
of above average intelligence (IQ greater than 100) 
were able to successfully complete the task. These 
instructions typically induce anxiety (Berkowitz, 
1959; Sarason, 1961). 

Subjects were then given two rating sheets consist- 
ing of a list of 13 bipolar adjectives. Subjects were 
asked to make successive judgments of the two other 
people in the room to whom they had been intro- 
duced. It was intended that these ratings would 
depend only on cues and characteristics perceived 
during the subjects' brief introduction to one another. 


RESULTS 


To assess any possible differences between 
experimental groups in ability to solve anagrams, 
the mean number of anagrams correctly solved 
was examined. Results from an analysis of variance 
provided no evidence that high-, medium-, 
and low-prejudiced subjects differed in the number 
of correct solutions within 15 minutes 
(F = 1.81, df = 2/24). 

Evaluations given by high, medium, and low 
anti-Semitic subjects were examined. Data consisted 
of total scores over the 13 adjective scales 
for each of the two people described, with low 
scores indicating a favorable evaluation and high 
scores indicating an unfavorable evaluation. 
In the analysis of variance, the three levels of 
prejudice formed the between-subjects factor and 
the two strangers described formed the 
within-subjects factor. 

The high-, medium-, and low-prejudiced sub- 
jects differed in their evaluation of strangers 
(F = 6.77, df = 2/24, p < .001). Table 1 indicates 
that the high and low anti-Semitic subjects 
both exhibited more negative evaluations of 
strangers than the moderately prejudiced sub- 
jects. No differences were found between the 
TABLE 1 
MEAN EVALUATION SCORES 

Level of prejudice    Total 

High                  2.92 
Medium                2.44 
Low                   2.92 


ratings of the first and second stranger (F = .90, 
df = 2/24). The interaction between prejudice 
and the ratings of the two strangers was significant 
(F = 4.68, df = 2/24, p < .05). The 
highly anti-Semitic subjects expressed more negative 
evaluations of the first person described 
whereas the low anti-Semitic subjects expressed 
more negative evaluations of the second person 
described. Differences between the mean evaluations 
of the first and second person indicated 
that the high- and low-prejudiced individuals 
judged greater differences between the strangers 
than did the moderately prejudiced. Table 1 
contains the mean evaluation scores for each 
group. 


DISCUSSION 


Data indicated that extremely high and extremely 
low prejudiced individuals react differently 
under stress than moderately prejudiced 
individuals. High and low scorers on the Anti-Semitism 
scale tended to report greater personality 
differences between two strangers and were 
more negative in their evaluations of strangers 
than moderately prejudiced subjects. Negative 
feelings towards strangers are aroused for high- 
and low-prejudiced scorers even when the stress 
results from not solving anagrams. 

These findings lend support to Rokeach's 
notion that there are similarities in the behavior 
of extremists. On the surface, it appears that 
Berkowitz' data indicating different judgmental 
processes for high- and low-prejudiced subjects 
were not supported. Comparison of Berkowitz' 
subject selection criteria with those used in the 
present study suggests reasons for the discrepancy 
in findings. Berkowitz' group of low scorers 
were selected from approximately the lower third 
of the A-S distribution with scores of 12 or lower, 
and were below the median on the F Scale. The 
low-prejudiced subjects in the study reported 
here had a median A-S of 4 with the highest 
score being 6 in that category. The medium subjects 
in this study had a median of 13, with the 
lowest score being 11. Thus, Berkowitz' low-prejudiced 
group was comparable to the medium-prejudiced 
group in this study. His findings might 
indicate differences between high- and medium-prejudiced 
individuals rather than between high- 
and low-prejudiced individuals. Such an interpretation 
is consistent with the findings of the 
present study. 

It is interesting to note, however, that although 
the absolute differences in rating of two strangers 
for high- and low-prejudiced subjects were simi- 
lar, the expression of negative feelings occurred 
at different points in time. High-prejudice people 
expressed more negativity immediately following 
frustration whereas the low-prejudiced people 
expressed more negativity toward the second 
person rated. It appears that these differences in 
liking and disliking others in the room were the 
result of general arousal of negative feelings, 
rather than veridical responses to any real per- 
sonality differences between the individuals rated. 
Furthermore, in both cases, these changes in the 
negative direction seemed to represent general 
shifts in the judgmental processes rather than 
being confined to a few specific adjectival 
judgments. 
PERSPECTIVE AS AN INTERVENING CONSTRUCT IN THE 
JUDGMENT OF ATTITUDE STATEMENTS 


THOMAS M. OSTROM 


A perspective theory of social judgment states that the range of stimuli taken 
into account by the judge serves as a potent determinant of reference-scale 
phenomena. The judge anchors the ends of his reference scale with the extremes 
of his perspective. This study investigated perspective end anchors for 2 levels 
of judge involvement. White and Negro college students judged statements of 
attitude toward the Negro. The results support the perspective anchoring 
interpretation. 


In recent years research dealing with the 
judgment of attitude statements has turned 
from its initial methodological orientation to 
the more comprehensive task of investigating 
the processes involved in the judgment of 
social stimuli. Several theoretical positions 
have evolved to account for findings in this 
area. Three which have received a great 
amount of attention are adaptation-level 
(Helson, 1964), assimilation-contrast (Sherif 
& Hovland, 1961), and Volkmann's (1951) 
"rubber-band" theory. Volkmann's position 
has subsequently been expanded by Upshaw 
(1962, 1965) and termed the variable perspective 
model. Each of these three orientations 
employs the underlying construct of the 
reference scale. They view the judging individual 
as evaluating social stimuli in the context 
of a personal frame of reference. 

These three theories differ in terms of the 
way they attempt to characterize the reference 
scale. They attach importance to different 
aspects of the reference scale in attempting 
to account for individual differences 
in social judgments. The prime determinant 
for Helson is the value which characterizes 
the center of the reference scale; for Volkmann 
and Upshaw it is the values which lie 
at the ends of the reference scale; and for 
Sherif and Hovland it is the location on the 
reference scale of the judge's own attitude 
and the level of his involvement in the issue. 
As a consequence of the theoretical importance 
attached to the reference scale in the 
area of social judgment, it is desirable to 
adopt a standard language for clarity of 
communication in discussing reference-scale 
phenomena. 

Origin and Unit 

Torgerson (1958, p. 79) has urged a 
change in terminology when dealing with 
absolute scales or reference scales. He replaces 
"shift in the absolute scale" with a 
change in origin of the absolute scale, and he 
replaces "expansion or contraction of the 
absolute scale" with a change in unit of the 
absolute scale. His terminology permits better 
specification of reference scale differences between 
individuals or between groups of individuals. 
The origin of a reference scale is 
directly reflected by the value of the stimulus 
placed at the midpoint of the scale (analogous 
to Helson's "adaptation level"), and 
inversely reflected by the average categorization 
of all stimuli presented for judgment. 
For a standard set of stimuli, the higher the 
origin, the greater the number of stimuli 
which are judged below the midpoint of the 
absolute scale. The unit of a reference scale 
refers to the range of stimuli placed in any 
particular judgmental category; the wider 
that range of stimuli, the larger the unit 
of the scale. A judge with a large unit 
(expanded absolute scale) concentrates a 
standard range of stimuli into fewer categories 
of his reference scale than does a judge 
with a small unit. This is assuming that the 
number of categories is constant for all 
judges. Origin and unit comprise the two 
parameters necessary to describe differences 
in judgmental behavior. 

The origin parameter has been widely 
studied in the context of social judgment (see 
Sherif & Hovland, 1961), but the unit pa- 
rameter has received much less attention. 
Upshaw (1964) has called attention to this 
disparity and has made a strong case for the 
importance of investigating judgmental unit in 
social psychological studies. He has shown 
that, in the Thurstone equal-appearing in- 
tervals judgment task (Thurstone & Chave, 
1929), the judge's attitude is related to the 
size of the unit the judge employs when 
evaluating the attitude statements on a pro 
to con dimension. In the judgment of state- 
ments reflecting attitude toward the Negro, 
Upshaw found that anti-Negro judges em- 
ployed a larger unit than did pro-Negro 
judges, and the unit for neutral judges was 
intermediate. This finding is consistent with 
data presented by Zavalloni and Cook (1965) 
who used the same item pool and judging pro- 
cedure. Although the findings of these two 
studies support one another, this does not 
indicate that there is any simple relationship 
between unit and attitude. Manis (1960), in 
a study dealing with statements of attitude 
toward fraternities, found what is here inter- 
preted to indicate a curvilinear relationship 
between unit and attitude. His extremely pro- 
and extremely anti-fraternity judges employed 
a smaller unit than his neutral judges. These 
findings indicate a need for theories in social 
judgment to take the unit parameter into 
consideration. 


Model 


Of the three theoretical orientations dis- 
cussed earlier, only Upshaw (1965) has pre- 
sented a conceptual model for social judgment 
which explicitly incorporates the unit as well 
as the origin parameter. His model is based 
on the construct of perspective and stems 
from an earlier presentation by Volkmann 
(1951) who introduced perspective as being 
the range of stimuli which the judge takes 
into account when performing the absolute 
judgment task. The defining stimuli of this 
range are the end stimuli. 
Upshaw employs these perspective indices 
in the formation of his variable perspective 
model for determining absolute judgments. 
He postulates that every judge anchors (or 
defines) the end categories of the absolute 
scale with the end stimuli of his perspective. 
These end stimuli or perspective end anchors 
represent the two extremes of the judge's 
perspective in the judging situation and 
serve to anchor the two extremes of the 
judge's reference scale. In the attitude rating 
task, each judge is pictured as establishing 
his own perspective with what he thinks is 
a maximally pro-attitude statement and what 
he considers to be a maximally anti-attitude 
statement. These two statements constitute 
the two extremes of the judge's perspective. 
The pro statement is postulated as anchoring 
the most pro-rating category, and the anti 
statement as defining the most anti-rating 
category. 

It is Upshaw's contention that the origin 
and unit of a judge's reference scale can be 
determined from knowledge of the judge's 
perspective end anchors. In terms of the vari- 
able perspective model, the origin of the 
judge's reference scale is a function of the 
sum (or average) of the stimulus values of 
the two perspective end anchors, whereas the 
unit is a function of the difference between 
the perspective end anchors. 
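
The relation between end anchors, origin, and unit can be illustrated with a short sketch. It assumes, purely for illustration, that the judge spaces the rating categories linearly between his two end anchors. The function names, the 11-category scale, and the anchor values are hypothetical and are not taken from Upshaw's presentation.

```python
def reference_scale(anti_anchor, pro_anchor, n_categories=11):
    """Return (origin, unit) of a judge's reference scale.

    Assumption for illustration: the origin is the average of the two
    end anchors, and the unit is their difference divided by the
    number of category steps.
    """
    origin = (anti_anchor + pro_anchor) / 2.0
    unit = (pro_anchor - anti_anchor) / (n_categories - 1)
    return origin, unit


def judged_category(stimulus, anti_anchor, pro_anchor, n_categories=11):
    """Rate a stimulus on a scale spaced linearly between the anchors."""
    origin, unit = reference_scale(anti_anchor, pro_anchor, n_categories)
    midpoint = (n_categories + 1) / 2.0
    return midpoint + (stimulus - origin) / unit


# Hypothetical stimulus values on an arbitrary attitude continuum.
narrow = judged_category(6.0, anti_anchor=0.0, pro_anchor=8.0)
wide = judged_category(6.0, anti_anchor=0.0, pro_anchor=12.0)
```

Under these assumptions, extending the pro anchor raises both the origin and the unit, so the same item receives a less pro rating from the judge with the wider perspective.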

The perspective end anchors are viewed as 
being modifiable. There are many potential 
determinants of the location of perspective 
end anchors. In the judgment of attitude 
statements, the range of stimuli presented for 
judgment is one such determinant of per- 
spective. In the absence of other influences, 
a judge tends to adopt the end stimuli pre- 
sented for judgment as his perspective end 
anchors (Volkmann, 1951). Upshaw (1965) 
has presented evidence showing differences in 
unit and origin between groups presented with 
differing ranges of attitude items to judge. 
He suggested that item range directly affects 
the judge's perspective end anchors, which, 
in turn, through the variable perspective 
model, determine the origin and unit of the 
reference scale. 

A variable which has received a great 
amount of attention in the area of social 
judgment is ego involvement. This variable 
is of central importance in the assimilation- 
contrast model of Sherif and Hovland (1961). 
Of prime consideration in this present 
orientation is the effect of ego involvement 
on the origin and unit of a judge's reference 
scale. In this regard, it should be noted that 
Ward (1962) observed differences in origin 
between differentially involved judges who 
rated attitude statements. His judges were 
matched in terms of own attitude position. 
The data he presented did not indicate a 
difference in reference-scale unit between the 
several levels of involvement. In his discus- 
sion Ward suggested that involvement affects 
the strength, or potency, of the end anchor. 
Unfortunately, his analysis did not provide 
a test of this suggestion. 


Purpose 


The variable perspective model advocated 
by Upshaw attributes all reference-scale dif- 
ferences among judges to differences in per- 
spective. If, therefore, involvement affects 
any property of the reference scale, it must, 
according to the model, be mediated by 
perspective. The present study consisted of a 
dual manipulation. First, item range, which 
Upshaw has found to affect both reference- 
scale origin and unit, was varied. Second, 
judge's involvement was varied by means of 
selecting white and Negro judges to rate 
items relating to attitude toward the Negro. 
This manner of varying involvement is the 
same as that used by Hovland and Sherif 
(1952). 

Perspective end anchors are defined as the 
stimuli which represent the extremes of the 
absolute judgment rating scale. A direct index 
of these end anchors should be provided by 
asking the judge to write a pair of attitude 
statements which would serve to define the 
two extreme rating-scale categories. An analy- 
sis of the extremity of these written state- 
ments should provide direct information on 
how perspective end anchors respond to a 
manipulation attempt. 

This study examines the effect of item 
range on perspective end anchors for each of 
two levels of judge involvement and the cor- 
respondence between the origin and unit of 
a reference scale and the perspective end 
anchors. The expectation of the variable 
perspective model is that, as the range of 
items is extended, the corresponding perspec- 
tive end anchors should be extended. The 
degree to which the judge is involved in the 
attitude issue is not expected to affect this 
prediction. The variable perspective model 
would suggest that the effects of ego involve- 
ment on absolute judgments are mediated by 
the judge's perspective. Although differences 
in width of perspective may appear between 
high- and low-involvement conditions, the 
manipulation of item range should affect the 
perspective of both involvement levels alike. 

The origin and unit of the reference scale 
used in judging the attitude items should be 
consistent with the observed perspective end 
anchors. The origin should be a function of 
the sum of the perspective end stimuli and 
the unit a function of the difference between 
the perspective end stimuli. The results of 
both the high- and low-involvement groups 
should correspond to this prediction. 


METHOD 
Procedure 


The manipulation of item range was based on the 
procedures used by Ferher (1952) and Upshaw 
(1962). A pool of 40 statements of "attitude toward 
the Negro" was assembled. The pool contained 
statements ranging in opinion from white supremacy 
to black supremacy. Of the 40 items, 30 were taken 
from the set constructed by Hinckley (1932) and 
used by Hovland and Sherif (1952) and Upshaw 
(1962) in earlier studies in this area. These 30 items 
were selected on the basis of scale values and Q 
values determined by Upshaw (1962). The items 
covered the entire attitude range of the Hinckley 
set (white supremacy to equality) with minimum Q 
values. The remaining 10 items in the pool expressed 
attitude positions between equality and black 
supremacy. A set of 39 items in this range was 
collected. Some were based on current Negro periodi- 
cals and some were taken from Ward (1962). 
A small group of students (n = 12) judged these 
statements along with 15 representative Hinckley 
items to determine approximate scale values and 
Q values. The 10 selected items were all judged 
more favorable than the highest Hinckley item, 
represented the entire range from equality to Negro 
superiority, and had minimum Q values. These items 
were added to the 30 Hinckley items to insure that 
the attitudes of all judges used in the study were 
represented in the 40-item pool. 

Three item ranges were constructed from the item 
pool. The a+ range covered positions from white 
supremacy to equality (the 30 Hinckley items); the 
t range, from white supremacy to black supremacy 


The item judgment task formed the second part 
of a two-part booklet. The first part required the 
subjects to judge density of a set of rectangular 
stimuli. But, since the first part was not varied 
between conditions, it is not material to the problem 
under consideration here. Three separate forms of 
the booklet were prepared, distinguished by the 
range condition appearing in the second part. 

The order of the items in the booklet was ran- 
domly determined except for the first three items in 
each range condition. The first three items were 
selected to represent the midpoint and approximate 
extremes of that particular range condition. Each 
item in the booklet was accompanied by an 11- 
interval rating scale. The instructions were basically 
the same as used by Hinckley (1932), Hovland and 
Sherif (1952), and Upshaw (1962). Subjects were 
asked to mark the interval which they felt to be 
most appropriate for each attitude item, Interval 1 
indicating a low social position and Interval 11 
indicating a high social position. 

On the last page of each booklet the subjects 
were asked to: write the interval number, 1-11, 
which best described their own opinion toward the 
social position of the Negro; write an attitude 
statement which best describes Interval 11; and write 
an attitude statement which best describes Interval 1. 
The statements which the judges wrote at this point 
shall be referred to as the pro- and anti-perspective 
statements. The subjects were cautioned against 
using any statement which appeared in the judging 
task when answering these two questions. 

The variable of ego involvement was manipulated 
by the use of criterion groups. The same type of 
groups were selected as Hovland and Sherif (1952) 
used in their manipulation of ego involvement. The 
low-involvement group was composed of white 
undergraduate students at a large southern univer- 
sity. This group is comparable to Hovland and 
Sherif’s “average” white subjects. The high ego- 
involvement subjects were undergraduates at a small 
southern state-supported college for Negroes. This 
high-involvement group is comparable to Hovland 
and Sherif’s “younger group” of Negro judges. 

In total, 129 white undergraduate students from 
the University of North Carolina and 121 Negro 
undergraduate students from North Carolina College 
served as judges. The experimental booklets were 
administered in the classroom in random order. The 
booklet administrator for the Negro subjects was 
their class instructor, a Negro, and for the white 
subjects, a white graduate student. The data of one 
judge was eliminated from all analyses because he 
assigned all items to the same category and did not 
complete the last page. 


Evaluation of Perspective Statements 


For purposes of analysis it was necessary to deter- 
mine the location of each subject’s perspective state- 
ments on the attitude continuum. Three judges were 
used in this phase of the experiment. None of these 
judges was familiar with the experimental hypothe- 
ses. The pages on which the statements had been 
written in the main experiment were removed from 
the booklets and arranged in alternating order by 
experimental condition. The statements were given 
to the judges in this form, along with a page of 
written instructions. 

The judges were required to make two judgments. 
Some of the experimental subjects, instead of writing 
attitude statements, wrote comments, expressions of 
their own views, or omitted that part of the task 
completely. It was therefore considered necessary to 
eliminate the irrelevant statements for purposes of 
testing the hypotheses. The judges first decided if 
the statement was relevant to the task; if so, they 
then judged it in terms of a set of ordered categories. 

Two sets of categories were provided, one for 
pro-Negro statements and the other for anti-Negro 
statements. The ordered categories for the pro items 
were defined as follows: qualified equality or lower, 
1; unqualified equality, 2; qualified Negro superior- 
ity (Negroes are superior in some respects and equal 
in others), 3; and unqualified Negro superiority, 4. 
These categories were so constructed to permit the 
identification of perspective beyond equality, the 
most pro position in the a+ range condition. 

The categories for the anti-Negro statements were 
constructed in a different manner. Instead of supply- 
ing a category label as was done for the pro cate- 
gories, the boundaries between adjacent categories 
were defined. This was done through the use of the 
following three attitude items: “I place the Negro on 
the same social basis as I would a mule,” A; “A ‘nig- 
ger’ in his place can be tolerated, but as the social 
equal of the white man he cannot be endured,” B; 
and “A Negro makes a good chauffeur but an impos- 
sible secretary,” C. Category 4 was defined to contain 
any statement less favorable than Item A, Category 3 
fell between Items A and B, Category 2 between 
Items B and C, and the first category covered all 
statements more favorable than Item C. Item A 
represents the lowest position in the a+ and the 
t range conditions. In terms of the scale values 
determined by Upshaw (1962), Item B is one scale 
unit higher (more pro-Negro) than Item A, and 
Item C is two scale units higher than Item A. 

A two out of three consensus procedure was used 
to determine both relevancy and scale position. If a 
statement was called irrelevant by at least two of 
the three judges, it was labeled irrelevant and not 
used in the later analysis. Of the remaining relevant 
statements, all those which did not meet the criterion 
of having at least two judges place them in the 
same category were discarded. 
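
The consensus rule described above can be sketched as a small function. This is a minimal illustration rather than the scoring procedure actually used; the function name and the coding of "irrelevant" as a string are assumptions made for the example.

```python
from collections import Counter

IRRELEVANT = "irrelevant"  # hypothetical code for the fifth category


def consensus(ratings):
    """Apply a two-out-of-three rule to three judges' ratings.

    Each rating is either IRRELEVANT or a category number from 1 to 4.
    Returns the agreed category, IRRELEVANT when at least two judges
    called the statement irrelevant, or None when no two judges agree
    (the statement is then discarded).
    """
    if len(ratings) != 3:
        raise ValueError("expected exactly three ratings")
    value, count = Counter(ratings).most_common(1)[0]
    return value if count >= 2 else None
```

A statement rated 3, 3, and irrelevant would be scored 3; a statement rated 1, 2, and 3 would be discarded for lack of consensus.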


RESULTS 


A breakdown by end of scale and degree of 
consensus is presented in Table 1 for the pro- 
cedure used in evaluating the extremity of 
the perspective statements. The expected per- 
centages are based on the assumption that 
the judges assigned the statements to the 
five categories (the fifth category being 
“irrelevant”) on a purely chance basis. For 
example, the chance probability of obtaining 
no consensus for any given statement, .48, is 
determined by multiplying the probability 
that the second judge will use a category 
different from the first judge (.80) by the 
probability that the third judge will use 
neither category employed by the first two 
judges (.60). 
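
The chance expectations can be checked by enumerating every way three judges could assign a statement to five equally likely categories. The sketch below assumes independent, equiprobable choices, as the text does; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

categories = range(5)  # four ordered categories plus "irrelevant"
counts = {"total": 0, "two_of_three": 0, "none": 0}

# Classify each of the 5 ** 3 equally likely triples by how many
# distinct categories the three judges used.
for triple in product(categories, repeat=3):
    distinct = len(set(triple))
    key = {1: "total", 2: "two_of_three", 3: "none"}[distinct]
    counts[key] += 1

chance = {k: Fraction(v, 5 ** 3) for k, v in counts.items()}
```

The enumeration gives 1/25 for total agreement and 12/25 each for two-of-three agreement and for no consensus, the last of which matches the product .80 x .60 = .48 given in the text.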

The procedure achieved consensus for 87.9% 
of all statements, or 441 out of the total 502. 
An inspection of Table 1 reveals that the 
judges had a high degree of agreement and 
that, despite the differences in the nature of 
the pro- and anti-judging categories, the 
amount of consensus was nearly identical for 
each end of the scale. This suggests that the 
scores assigned to the perspective statements 
are reasonably reliable and can be validly 
used in further analysis. 


Perspective Statements 


Upshaw (1965) has shown with a similar 
manipulation of item range that the origin 
of the scale is more pro-Negro in the a− 
item range than the a+, with the t range 
intermediate. Also Ferher (1952) found the 
same influence on origin when item range 
was manipulated. The origin moves away 
from the aborted end of the scale. The unit 
was shown by Upshaw (1965) to be larger 
in the t item range than in either of the 
aborted ranges. To account for these findings, 
the variable perspective model predicts that 
the judges' perspectives should extend highest 
in the t and a− item ranges at the pro end 
of the scale, and lowest in the a+ and t item 

TABLE 1 

PERCENTAGE OF PERSPECTIVE STATEMENTS RECEIVING 
DIFFERING DEGREES OF CONSENSUS 

                          Degree of consensus 
End of scale           Total    Two of three    None 
Pro                    29.5%       59.8%        10.7% 
Anti                   33.1%       53.4%        13.5% 
Expected by chance      4.0%       48.0%        48.0% 

TABLE 2 

ANALYSIS OF VARIANCE OF PRO-NEGRO PERSPECTIVE 
STATEMENTS AS A FUNCTION OF RACE AND 
ITEM RANGE 

Source             df       MS        F 
Race (R)            1      3.308     3.43* 
Item range (I)      2      7.557     7.83** 
R X I               2      0.349     <1 
Within cells      181      0.965 
Total             186 

* p < .10. 
** p < .01. 


ranges at the anti end of the scale. The per- 
spective end anchors should be directly 
related to the extreme stimuli of the item 
ranges. 

A separate analysis of variance was done 
on the pro- and anti-perspective statements. 
Since the number of statements which were 
discarded as being irrelevant or for lack of 
consensus varied from cell to cell in the de- 
sign, the analysis was performed with unequal 
cell frequencies. The harmonic mean solution 
presented by Winer (1962) was used. The 
harmonic means of the cell frequencies for 
the pro and anti analyses were 30.35 and 
29.93, respectively. The summaries of these 
analyses are in Tables 2 and 3. 
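
For readers unfamiliar with the unweighted-means approach, the harmonic mean of k cell frequencies is k divided by the sum of the reciprocals of the frequencies. The sketch below shows the computation; the cell frequencies in the example are hypothetical, since the actual cell sizes are not reported here, and the function name is an assumption.

```python
def harmonic_mean_cell_size(cell_sizes):
    """Harmonic mean of the cell frequencies in a factorial design."""
    if any(n <= 0 for n in cell_sizes):
        raise ValueError("every cell needs at least one observation")
    return len(cell_sizes) / sum(1.0 / n for n in cell_sizes)


# Hypothetical frequencies for a 2 x 3 design (not the study's cells).
example_cells = [28, 35, 31, 30, 33, 27]
n_h = harmonic_mean_cell_size(example_cells)
arithmetic = sum(example_cells) / len(example_cells)
```

The harmonic mean never exceeds the arithmetic mean, so the procedure is conservative when cell sizes differ.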

The Race X Item Range interactions for 
both analyses result in F ratios of less than 1. 
With the exception of the race effect in the 
pro analysis, all other effects reach the .01 
level of significance. The race effect in the pro 
analysis is marginally significant (p < .10). 
The cell means from the two analyses of 
variance are presented in Figure 1. 
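
As a check on the summaries, each F ratio in Tables 2 and 3 can be recomputed as the effect mean square divided by the within-cells mean square. The sketch assumes that the within-cells term is the error term for every effect in these two analyses; the dictionary keys are illustrative labels.

```python
def f_ratio(ms_effect, ms_error):
    """F ratio for an effect tested against a single error term."""
    return ms_effect / ms_error


# Mean squares as printed in Tables 2 (pro) and 3 (anti).
pro = {"race": 3.308, "item_range": 7.557, "interaction": 0.349, "within": 0.965}
anti = {"race": 15.727, "item_range": 9.318, "interaction": 0.700, "within": 0.801}

pro_f = {k: f_ratio(v, pro["within"]) for k, v in pro.items() if k != "within"}
anti_f = {k: f_ratio(v, anti["within"]) for k, v in anti.items() if k != "within"}
```

The recomputed ratios agree with the tabled values to two decimals, and both interaction ratios fall below 1.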

Figure 1 shows that the white judges, in 
comparison with the Negro judges, wrote per- 
spective statements more pro Negro at the 


TABLE 3 

ANALYSIS OF VARIANCE OF ANTI-NEGRO PERSPECTIVE 
STATEMENTS AS A FUNCTION OF RACE AND 
ITEM RANGE 

Source             df       MS        F 
Race (R)            1     15.727    19.63** 
Item range (I)      2      9.318    11.63** 
R X I               2      0.700     <1 
Within cells      167      0.801 
Total             172 

** p < .01. 
FIG. 1. Mean placement of pro- and anti-perspective 
statements as a function of race and item range. 


pro end and more anti Negro at the anti 
end of the attitude continuum. This difference 
could, although need not necessarily, be inter- 
preted as a difference in perspective between 
the two races. The measure of perspective 
used is dependent upon verbal facility and 
cultural modes of expression. 

The absence of a significant interaction in 
either the pro or anti analysis indicates that 
the manipulation of item range did not have 
a differential effect between the two races. 
Regardless of possible initial differences in 
perspective, each race responded in the same 
way to the item-range variations. 

As predicted by the variable perspective 
model, the item-range main effect was signifi- 
cant for both pro- and anti-perspective state- 
ments. The pattern of means in Figure 1 
shows that the a+ range is the least extreme 
of the conditions at the pro end, and the a− 
range is the least extreme of the conditions 
at the anti end. This is the pattern which was 
predicted. Table 4 presents the individual 
comparisons (Winer, 1962) within the item- 
range main effects which were predicted to 
be significant. Clear significance was attained 
in three of the comparisons and marginal 
significance for the remaining a+ versus 
a− at the pro end. The variable perspective 
predictions were substantially confirmed. 

The variable perspective model made no 
predictions regarding the remaining main- 
effect comparisons. The multiple comparison 
ts were 2.42 for the pro items’ a− versus 
t comparison and 1.77 for the anti items’ a+ 
versus t comparison. The two comparisons are 
significant at the .05 and the .10 levels, re- 
spectively, using a two-tailed test. These 
comparisons, although post hoc in nature, 
suggest that restriction of an item range at 
one end of the attitude continuum leads to 
less extreme anchors at both ends of the 
scale. 


Reference Scale Origin and Unit 


A critical test of the variable perspective 
model is whether the origin and unit pa- 
rameters of the six subgroups correspond to 
the findings of the perspective statements. To 
provide this test, scale values were deter- 
mined for the 20 attitude items which were 
common to all three item ranges. Separate 
item medians were computed within each of 
the six subgroups in the study. The medians 
were based on sample sizes which varied be- 
tween 37 and 44 subjects. An analysis of 
variance of these medians provided a test for 
the predictions regarding origin and unit. The 
analysis was composed of three main effects 
(item range, race, and scale position) and 
their interactions. The item range effect had 
three levels (a+, t, a−) and the race effect 
had two levels (white and Negro). The scale 
position effect was determined by ordering 
the 20 common items by their mean scale 
position over the six subgroups, and then 
partitioning them into 10 pairs of adjacent 
items. These 10 item pairs formed the 10 
levels of the scale position main effect. Item 
pairs, nested within scale position, were con- 
sidered random. The three other main effects 


TABLE 4 

INDIVIDUAL COMPARISONS OF PERSPECTIVE STATE- 
MENTS FOR THE ITEM-RANGE MAIN EFFECTS 

Perspective end and comparison      df       t 
Pro 
  a+ versus t                      181     3.39** 
  a+ versus a−                     181     1.50* 
Anti 
  a− versus t                      167     4.76** 
  a− versus a+                     167     2.99** 

* p < .10, one-tailed. 
** p < .01, one-tailed. 
were considered fixed (Green & Tukey, 1960). 
Upshaw (1965) presents a full discussion of 
the reasoning upon which the tests of origin 
and unit are based. A summary of that logic 
precedes each of the predictions regarding the 
two reference-scale parameters. 
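
The construction of the scale position factor, in which common items are ordered by mean scale value and grouped into adjacent pairs, can be sketched as follows. The function name, item labels, and scale values are hypothetical and serve only to illustrate the partitioning step.

```python
def pair_adjacent_items(mean_scale_values):
    """Order items by mean scale value and group them into adjacent pairs.

    mean_scale_values maps an item label to its mean scale position over
    all subgroups. Returns a list of pairs, lowest scale position first.
    """
    if len(mean_scale_values) % 2:
        raise ValueError("an even number of items is required")
    ordered = sorted(mean_scale_values, key=mean_scale_values.get)
    return [tuple(ordered[i:i + 2]) for i in range(0, len(ordered), 2)]


# Hypothetical labels and values for six items (the study used 20).
example = {"a": 7.2, "b": 3.1, "c": 9.8, "d": 3.4, "e": 6.9, "f": 10.1}
pairs = pair_adjacent_items(example)
```

Each resulting pair forms one level of the scale position factor, with the two items in a pair treated as the random replicates nested within that level.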

Differences in origins of a reference scale 
are inversely related to the average placement 
of common items. As the origin is more pro 
Negro, a greater number of items are seen 
as anti Negro by the judge. Since the origin 
is a function of the average of the two per- 
spective end anchors, the common items were 
expected to be judged the most pro Negro in 
the a+ item range and the least pro Negro 
in the a− range, with the t item range inter- 
mediate. This ordering should appear for each 
race. A Race X Item Range interaction would 
not be expected on the basis of the per- 
spective statement analysis, since the Race 
X Item Range interaction was not significant 
for the statements at either end of the scale. 

The reference-scale unit is reflected by the 
dispersion of scale values assigned to the 
common items. A wide perspective is assumed 
to be associated with a large unit, hence low 
dispersion of the items. Dispersion differences 
would be revealed in the interaction of the 
scale position with other effects. Based on 
the information from the analysis of per- 
spective statements, a significant Item Range 
X Scale Position interaction was expected. 
The dispersion of the item medians was 
predicted to be lower for the t item range 
than for either of the other two ranges since 
the t item range judges indicated a wider 
perspective than the other two item ranges. 
This Item Range X Scale Position interaction 
was expected to be of the same form for 
each race. 

The summary table for the analysis of vari- 
ance of item medians is presented in Table 5. 
The three main effects are significant at the 
.01 level and the Race X Item Range and 
the Item Range X Scale Position interactions 
reach marginal significance (p < .10). 
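
Because item pairs are treated as a random factor nested within scale position, each fixed effect in Table 5 is tested against its own interaction with item pairs rather than against a single error term. The sketch below recomputes the F ratios on that basis. The pairing of effects with error terms is an inference from the design description and the printed values, not a statement taken from the original analysis.

```python
# Mean squares as printed in Table 5.
MEAN_SQUARES = {
    "R": 20.156, "I": 5.306, "S": 58.529,
    "RxI": 0.846, "RxS": 1.328, "IxS": 1.016, "RxIxS": 0.293,
    "P": 1.042, "RxP": 1.716, "IxP": 0.541, "RxIxP": 0.280,
}

# Assumed error term for each effect: its interaction with item pairs (P).
ERROR_TERM = {
    "R": "RxP", "I": "IxP", "S": "P",
    "RxI": "RxIxP", "RxS": "RxP", "IxS": "IxP", "RxIxS": "RxIxP",
}

f_ratios = {
    effect: MEAN_SQUARES[effect] / MEAN_SQUARES[error]
    for effect, error in ERROR_TERM.items()
}
```

With these error terms the recomputed ratios agree with the tabled F values to within rounding of the printed mean squares.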

The pattern of average item medians for 
the six experimental conditions can be seen 
in Table 6. The significant item-range main 
effect is as predicted, the a+ range having 
the highest mean and a− the lowest mean. 
The marginally significant Race X Item 
TABLE 5 

ANALYSIS OF VARIANCE OF COMMON ITEM MEDIANS AS 
A FUNCTION OF ITEM RANGE, RACE, AND 
SCALE POSITION 

Source                 df       MS         F 
Race (R)                1     20.156     11.75** 
Item range (I)          2      5.306      9.80** 
Scale position (S)      9     58.529     56.19** 
R X I                   2      0.846      3.02* 
R X S                   9      1.328      0.77 
I X S                  18      1.016      1.88* 
R X I X S              18      0.293      1.05 
Item pairs (P)         10      1.042 
R X P                  10      1.716 
I X P                  20      0.541 
R X I X P              20      0.280 
Total                 119 

* p < .10. 
** p < .01. 

Range interaction appears to be due to the 
origin of the Negro judges being affected 
to a greater degree by the item-range ma- 
nipulation than the origin of the white 
judges. This difference between the races 
was not anticipated from the perspective 
statements analysis. 

The means for the Item Range X Scale 
Position interaction are presented in Table 7. 
If the interaction is due to the differences in 
unit between the several item ranges, each 
item range should be linearly related to the 
other ranges. Scatterplots were drawn and 
linear correlation coefficients were calculated 
for each pair of item ranges over scale po- 
sition. By inspection, the scatterplots indicate 
a linear relationship between the item ranges. 
The correlation between a+ and t was .99, 
between a− and t was .94, and between a+ 
and a− was .96. The linear scatterplots and 
the high linear correlations indicate that the 


TABLE 6 

MEAN ITEM MEDIANS AS A FUNCTION OF 
RACE AND ITEM RANGE 

                 Item range 
Race         a+       t       a−       M 
White       6.86     6.61    6.42     6.63 
Negro       8.02     7.31    7.03     7.45 
M           7.44     6.96    6.72     7.04 

Note. Higher values are associated with the pro-Negro end 
of the scale. 
TABLE 7 

MEAN ITEM MEDIANS AS A FUNCTION OF ITEM 
RANGE AND SCALE POSITION 

                       Item range 
Scale position      a+        t        a− 
 1                 4.01      3.46     3.22 
 2                 4.69      4.50     3.62 
 3                 5.47      5.19     4.47 
 4                 7.01      6.51     4.61 
 5                 7.63      6.68     7.09 
 6                 8.24      7.10     8.59 
 7                 8.46      8.17     8.34 
 8                 8.98      8.68     8.79 
 9                 9.56      9.35     9.03 
10                10.34      9.95     9.48 
M                  7.44      6.96     6.72 
SD                 2.12      2.13     2.47 


marginally significant Item Range X Scale 
Position interaction is due to differences in 
unit. Inspection of the standard deviations of 
the common item medians in Table 7 shows 
the a− item range with a higher SD than 
either of the other two ranges. There is no differ- 
ence in SD between the a+ and t item 
ranges. It was predicted that the a− range 
should have a smaller unit (larger SD) than 
the t range. It was also predicted that the 
a+ range should have a smaller unit than the 
t range. The former was supported, the latter 
was not supported. The absence of a three- 
way interaction indicated that the units of 
both races were similarly affected by the 
item-range manipulation. 
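
The standard deviations and correlations discussed above can be recomputed from the ten scale-position means in Table 7. The sketch below assumes that the reported SD uses the n − 1 denominator and that the correlations were computed over these ten means; the helper names are illustrative.

```python
from math import sqrt

# Mean item medians by scale position, as printed in Table 7.
A_PLUS = [4.01, 4.69, 5.47, 7.01, 7.63, 8.24, 8.46, 8.98, 9.56, 10.34]
T_RANGE = [3.46, 4.50, 5.19, 6.51, 6.68, 7.10, 8.17, 8.68, 9.35, 9.95]
A_MINUS = [3.22, 3.62, 4.47, 4.61, 7.09, 8.59, 8.34, 8.79, 9.03, 9.48]


def mean(xs):
    return sum(xs) / len(xs)


def sample_sd(xs):
    """Standard deviation with the n - 1 denominator."""
    m = mean(xs)
    return sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


sds = {name: sample_sd(v) for name, v in
       [("a+", A_PLUS), ("t", T_RANGE), ("a-", A_MINUS)]}
r_plus_t = pearson_r(A_PLUS, T_RANGE)
r_minus_t = pearson_r(A_MINUS, T_RANGE)
r_plus_minus = pearson_r(A_PLUS, A_MINUS)
```

Under these assumptions the recomputed values agree closely with the tabled standard deviations and with the correlations reported in the text.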

Differences between the races in origin and 
unit would be reflected by the race main 
effect and the Race X Scale Position inter- 
action. No predictions were made about race 
differences on the basis of the perspective 
statements. The race main effect was signifi- 
cant, with Table 6 showing the Negro judges 
having a higher mean item placement than 
the white judges. Scale values for the two 
races appeared to be linearly related. The 
scatterplot appeared linear and the product- 
moment correlation over scale position was 
.98. The SD over scale position for white 
judges was 2.00 and for Negro judges was 
2.45. Although this difference between stand- 
ard deviations appears large, the insignificant 
Race X Scale Position interaction indicates it 
cannot be considered interpretable. 


DISCUSSION 


The variable perspective model postulates 
that a significant determinant of the unit and 
origin of a reference scale is the judge's per- 
spective. In the absence of other influences, 
the judge will adopt the end stimuli of the 
item range as his perspective end anchors. 
This expectation was consistent with findings 
from other studies (Ferher, 1952; Upshaw, 
1965). The analysis of perspective statements 
fully substantiated this expectation for both 
races. The manipulation of item ranges was 
found to directly affect the extremity of the 
perspective statements. The manner in which 
this manipulation affected the perspective 
statements was identical for both the high 
and low ego-involvement conditions. This 
finding provides strong support for the ac- 
ceptance of perspective as an intervening 
construct in social judgment. 

One other determinant which could have 
affected the predictions regarding perspective 
end anchors was the location of the judges' 


of the scale with his own attitude rather than 
the end stimuli of the item range. The per- 
spective end anchor corresponding to the 
attitude of the out-of-range judge would not 
be affected by item-range variations at that 
end of the scale. Presence of judges whose 
own attitude lay at extreme white supremacy 
or at extreme black supremacy would thus 
attenuate the test of the variable perspective 
predictions. Although Negro judges rated 
themselves significantly more pro Negro (p 
< .001), very few indicated attitudes beyond 
racial equality. It was found in the a+ range 
condition that 3 of 28 Negro subjects spon- 
taneously wrote black supremacy perspective 
statements. Comparable estimates would not 
be appropriate for the other two item-range 
conditions in that black supremacy state- 
ments were presented in the judging task. At 
the level of informal evidence, postexperi- 
mental talks with the subjects and with the 
instructors indicated that an exceedingly 
small number of the subjects could have 
possibly had such extreme attitudes. 

The prediction of origin on the basis of 
the earlier studies (Ferher, 1952; Upshaw, 
1965) and the perspective statements analy- 
sis of this study was upheld for both races 
in the analysis of the common item medians. 
The origin of the Negro judges was affected 
somewhat more strongly by the item-range 
manipulation than was the origin of the white 
judges. This difference could be the result of 
the Negro judges accepting the end stimuli 
of the item range as perspective anchors more 
readily than the white judges. If this was the 
case, the difference was not reflected in the 
perspective statements analysis. 

The findings with regard to unit were not 
wholly consistent with the expectations based 
on the earlier studies or the perspective state- 
ments analysis. The a− item range had a 
smaller unit than the t range, as expected, 
but there were no differences between the t 
and the a+ item ranges. The a+ range was 
predicted to have a smaller unit. From in- 
spection of individual item distributions, it 
appeared that some judges in the t range 
condition continued to define the highest 
category as representing equality between the 
races. This occurred despite the inclusion of 
black supremacy items. In fact, some judges 
used the low categories for items expressing 
both white and black supremacy. These 
judges seemed to have changed the judgment 
continuum from a low to high social position 
dimension to a dimension of racial domina- 
tion to racial equality. Other judges who de- 
fined the highest category as racial equality 
placed the black supremacy items in that 
same category. The former bias (redefinition 
of the judgment continuum) would lead to 
the expectation of no difference in unit or 
origin between the a+ and t item ranges. 
The latter bias (not discriminating between 
black supremacy and equality items) would 
produce little effect on unit, but the observed 
effect on origin. (See Johnson, 1944, on the 
effect of distribution skew on absolute judg- 
ment.) Perhaps a more conventional item 
pool would have circumvented these problems 
caused by idiosyncratic definitions of the 
judgmental continuum. 

The findings of the perspective statements 
were not used as a basis for predicting origin 
and unit differences between the high and 
low ego-involvement groups. There are some 
difficulties with asking the subjects to write 
an attitude statement and using this as an 
index of the subjects’ perspective end anchor. 
This index may well be affected by differences 
in communication style, cultural standards 
concerning modes of expression, and other 
such factors. The possible presence of this 
type of variable would have only random ef- 
fects on the item-range conditions, but would 
be a confounding factor with the race dif- 
ferences. Since there was no way to eliminate 
such potential Negro versus white differences, 
no comparisons between the two races were 
made on the basis of the perspective state- 
ments analysis. 

In comparing the two races on the basis 
of the origin and unit of their reference 
scales, it was found that no significant dif- 
ference in unit appeared and that the white 
judges had a higher origin than the Negro 
judges. The judgments of the high-involve- 
ment group were found to be linearly related 
to those of the low-involvement group, with 
the only reference-scale differences appear- 
ing in the origin parameter. The appearance 
of this difference for each item range suggests 
that the two races had different perspectives 
prior to engaging in the experimental task. 

The difference in origin between the Negro 
and white judges was greatest in the a+ 
item range. This was initially surprising, in 
that the a+ item range condition comprised 
a replication of Hovland and Sherif (1952) 
who found differences between the races that 
were exactly opposite in direction. They sug- 
gested that Negro judges, being highly in- 
volved in the issue and holding extreme atti- 
tudes, displace the attitude items away from 
their own position and toward the anti-Negro 
end of the scale. Their data indicated that 
Negro judges did tend to lump items at the 
anti-Negro end of the scale, when compared 
to average white judges. This ego-involvement 
interpretation was supported in a study by 
Ward (1962) who compared a group of civil- 
rights pickets and a group of nonpickets with 
regard to their judgment of a set of Negro 
attitude statements. He used all white judges 
and matched the two groups as to attitude 
toward the Negro. His group of pickets gave 
less favorable judgments to the set of attitude
items than did the nonpickets. The data of
the present study indicate that, over the space 
of 10 or 11 years, the race difference ob- 
served by Hovland and Sherif has reversed 
itself. The white judges rated the items more 
anti-Negro than the Negro judges. The vari- 
able perspective model would suggest that 
these opposite findings might be reconcilable. 
It is conceivable that the perspective of the 
white Southern college student has undergone 
a considerable expansion since 1952. He may 
“take into account” more extreme pro-Negro 
attitudes when anchoring the pro end of his 
reference scale. If this were the case, a 
common set of items would be judged less 
favorable today than in 1952. Pairing this 
with the supposition that events in race rela-
tions have changed the Negro perspective less
than that of the white, the discrepant findings 
of the two studies could be brought into 
accord. An investigation is underway to test 
the validity of this variable perspective 
deduction. 

The apparent conflict between the data of
this study and those of Hovland and Sherif
(1952) and Ward (1962) suggests that there 
are sources other than ego involvement which 
contribute to such criterion group differences 
in social judgment.

Groups composed of people with similar (homogeneous) and dissimilar
(heterogeneous) personalities were compared for their attractiveness for their
members at the 5th, 8th, and 11th wk. after their formation. Analyses of
variance of ratings and choice measures showed no significant differences in
attraction to the 2 types of groups and no variations in these differences over
time. Rather substantial individual and group differences within each type
were revealed. The level of attraction rose from the 5th to the 8th wk., then
was unchanged or dropped slightly by the 11th wk. The results replicate
Hoffman's earlier results in disconfirming the proposition that similarity of
personality produces interpersonal attraction in problem-solving groups.


Considerable evidence has been accumu-
lated to support the contention that people
with similar personalities are more attracted
to each other than are people with different
personalities (Izard, 1960; Rosenfeld & Jack-
son, 1959). Most of this evidence has been
derived from correlational studies which show
that people who choose each other on a socio-
metric measure have more similar personali-
ties than do people who are not mutual
choices. These results have led to the generally
accepted causal hypothesis that people with 
similar personalities are attracted to each 
other. 

Only two experimental tests of this hy- 
pothesis have been reported, however, with 
conflicting results. Newcomb (1961) reported 
that students in a residential housing unit 
who had initially similar attitudes on certain
issues indicated more liking for each other at
the end of the experimental period than did
people with dissimilar attitudes. Hoffman
(1958), on the other hand, reported that in
problem-solving groups, members of groups
with similar personalities (homogeneous
groups) showed no more attraction to their 
groups than did members of groups with dis- 
similar personalities (heterogeneous groups). 
The findings in that study suggested, however, 
that the bases for attraction in the two types 
of groups were different. 

1 The research reported here was done in connec- 
tion with Grant M-2704, from the National Insti- 
tutes of Health, United States Public Health Service. 
A shorter version was presented at the meetings of 
the Eastern Psychological Association, April 28, 1962. 


The present study was designed to repli- 
cate, and elaborate somewhat, Hoffman's 
(1958) experiment of the effects of person- 
ality similarity or dissimilarity on the attrac- 
tion of members for their problem-solving 
groups using a more detailed set of measures. 
Measures were taken at three times during 
the semester to examine changes in attraction 
over time. Also, each time subjects were asked 
about their liking for the other members of 
the group and about the others’ problem-solv- 
ing abilities. 

This examination of the relative attractive- 
ness of homogeneous and heterogeneous groups 
assumes special importance in the light of the 
previously reported general superiority in 
problem-solving effectiveness of the hetero- 
geneous groups (Hoffman & Maier, 1961). 
We may ask whether such enhanced task per- 
formance was achieved at the expense of the 
group’s attractiveness for its members. 


METHOD 
Establishment of Experimental Conditions 


Students in the eight laboratory sections of an 
undergraduate psychology course in human rela- 
tions were assigned to four-person groups of either 
homogeneous or heterogeneous types, according to 
their score profiles on the Guilford-Zimmerman 
Temperament Survey (GZTS; Guilford & Zimmer- 
man, 1949). The GZTS is a paper-and-pencil self- 
report personality measure which provides 10 rela-
tively independent measures of personality traits.²


2 For a more detailed description of the person-
ality inventory and of the method of assigning sub-
jects to groups, see Hoffman (1958). A comparison
of the problem-solving effectiveness of the homo-

Homogeneous groups were composed of people with
high positive correlations (Kendall taus) among 
their profiles while heterogeneous groups had peo- 
ple with near-zero correlations or, in a few cases, 
with combinations of high negative and positive 
correlations. This measure of personality similarity is, 
of course, only one of a number which might have 
been used. The profile correlation assumes that the
similarity of an ipsative scoring of the Survey is
more indicative of the differential self-perceptions
of subjects than a measure which reflected the ab- 
solute level of scores on the different scales. Since 
the groups composed on this basis did differ in their 
problem-solving effectiveness, consistent with the 
assumed differences or similarities in their percep- 
tions (Hoffman & Maier, 1961), the relative at- 
tractiveness of the members for their groups com- 
posed on this basis seemed worthy of study. 
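
The profile-similarity measure described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the four ten-scale profiles are invented, the function names are hypothetical, and the tau-a form (tied pairs contribute zero) is an assumption, since the text says only "Kendall taus."

```python
from itertools import combinations


def kendall_tau(p, q):
    """Kendall tau-a between two equal-length score profiles.

    Each pair of scales is concordant if both profiles order the two
    scales the same way, discordant if they order them oppositely.
    Tied pairs add nothing to either count (an assumption of this sketch).
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(p)), 2):
        s = (p[i] - p[j]) * (q[i] - q[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(p) * (len(p) - 1) // 2
    return (concordant - discordant) / n_pairs


def mean_pairwise_tau(profiles):
    """Average tau over all pairs of members in one group."""
    pairs = list(combinations(profiles, 2))
    return sum(kendall_tau(a, b) for a, b in pairs) / len(pairs)


# Hypothetical scores on 10 scales for a four-person group.
group = [
    [7, 3, 5, 9, 1, 8, 2, 6, 4, 10],
    [6, 3, 5, 9, 2, 8, 1, 7, 4, 10],
    [7, 2, 5, 10, 1, 8, 3, 6, 4, 9],
    [8, 3, 4, 9, 1, 7, 2, 6, 5, 10],
]
print(round(mean_pairwise_tau(group), 3))
```

A group whose mean pairwise tau is high would count as homogeneous in the sense used here; a group with taus near zero would count as heterogeneous.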

Groups were formed in the third week of the 
semester and interacted, in the weekly problem- 
solving and role-playing exercises, only with the 
members of their own groups. They had opportunity, 
however, to become acquainted with the other stu- 
dents in their laboratory section during the rest 
break and class discussion following each exercise. 
The discussion also revealed how each group ap- 
proached the problem and provided some basis for 
evaluating these approaches, although the instructors 
usually analyzed rather than evaluated the groups’ 
problem-solving process. Responses of the members 
of 15 homogeneous groups and 18 heterogeneous 
groups are analyzed in the present study. 


Administration of Sociometric Measures 


At the end of the 2-hour laboratory periods in the 
fifth, eighth, and eleventh weeks after the groups 
had been established—the eighth, eleventh, and 
fourteenth weeks of the semester—subjects com- 
pleted detailed sociometric questionnaires. Each sub- 
ject rated each of the other members of his own 
group on a 9-point scale (from “not at all” to “very
much”) according to how much he liked him and
again according to his problem-solving and role-
playing ability. At the same time each subject chose
three people from the entire laboratory section whom 
he would most like to have in his group, once be- 
cause he liked them and a second time on the basis 
of their problem-solving ability. 

Four measures of each member’s attraction to 
his group were obtained from these questions. 
Numerical values from 1 to 9 were assigned to each 
rating, representing increasing degrees of attraction. 
The mean of the three ratings by which each subject 
reported how much he liked the other members of 
his group was used as an index of his attraction to 
the group on the basis of liking (liking rating). 
The mean of his three ratings of problem-solving 
ability represented his attraction to the group on 
that basis (problem-solving rating). 

Two similar indexes of attraction were derived 


geneous and heterogeneous groups appears in Hoff-
man and Maier (1961).

from each subject’s choices as a function of how
often he chose his own group members from all the
people in his laboratory section. Since the sections
varied in size from 11 to 21, an index of ingroup
attraction was developed which reflected the greater
mathematical probability of making an ingroup
choice in the smaller sections.³ The index varies from
0 to 100 representing increasing degrees of attraction.
[Indexes based on the liking choices]
and on the problem-solving choices were computed
to measure each subject's attraction to his group on
these two bases.
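
The reason section size matters can be shown with a chance model. The sketch below does not reproduce the index itself, whose formula is given only in Hoffman (1962). It assumes, for illustration, that a subject picks three of the other section members at random and has three group mates; the function names are hypothetical.

```python
from math import comb


def p_ingroup_choices(section_size, k, group_size=4, n_choices=3):
    """Chance probability of exactly k ingroup choices (hypergeometric).

    Assumes the chooser picks n_choices distinct people at random from
    everyone else in the section.
    """
    others = section_size - 1      # everyone except the chooser
    mates = group_size - 1         # the chooser's own group members
    return (comb(mates, k) * comb(others - mates, n_choices - k)
            / comb(others, n_choices))


def expected_ingroup_choices(section_size, group_size=4, n_choices=3):
    """Expected number of ingroup choices under random choosing."""
    return n_choices * (group_size - 1) / (section_size - 1)


for size in (11, 21):              # smallest and largest sections reported
    p_any = 1 - p_ingroup_choices(size, 0)
    print(size, round(p_any, 3), round(expected_ingroup_choices(size), 2))
```

Under these assumptions a random chooser in an 11-person section makes twice as many ingroup choices, on average, as one in a 21-person section, so raw counts of ingroup choices are not comparable across sections without some correction.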

Although these four measures of attraction to the
group (liking rating, problem-solving rating, liking
choice, and problem-solving choice) were highly cor-
related in the heterogeneous groups, their moderate
correlation in the homogeneous groups has prompted 
us to report their analysis separately (cf. Hoffman, 
1962). 

Design of Analysis⁴

Each of these four measures was subjected to a
complex analysis of variance with the following
sources of variance: between group types (homo-
geneous versus heterogeneous), between sociometrics,
Sociometrics X Type interaction; and, within types:
between groups, Sociometrics X Groups interactions,
between members in groups, and finally, within
members in groups (really Sociometric X Member
interaction).

The analytic design employed assumed that one
variable, group types, was assigned two fixed values,
homogeneous and heterogeneous, while the other
variables (sociometrics, groups, and individuals)
were each random samples from an unspecified
population. This decision required the computation
of estimates of variance components to generate
appropriate error terms to test for the statistical
significance of certain sources of variance. The F
tests reported have been labeled as “quasi-F” by
Winer (1962) in his discussion of the use of vari-
ance component estimates for constructing appropri-
ate error terms, but they appear similar to the
results obtained under the assumptions that all
variables but groups and subjects were fixed.
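
The construction of the quasi-F error terms can be checked against the liking-rating column of Table 1. The sketch below is a worked check rather than the original computation. It takes the mean squares as printed, combines them as the table notes describe, and uses hypothetical dictionary keys and function names.

```python
# Liking-rating mean squares as printed in Table 1.
ms = {
    "group_type": 0.70,        # A
    "sociometrics": 4.36,      # B
    "type_x_socio": 0.48,      # B x A
    "groups_in_types": 4.87,   # C
    "socio_x_groups": 0.43,    # B x C
    "members": 2.23,           # between members in groups and types
    "within_members": 0.43,    # within members in groups and types
}


def error_for_group_type(ms):
    """Between Groups + Sociometrics X Type - Sociometrics X Groups."""
    return ms["groups_in_types"] + ms["type_x_socio"] - ms["socio_x_groups"]


def error_for_groups(ms):
    """Between Members + Sociometrics X Groups - Within Members."""
    return ms["members"] + ms["socio_x_groups"] - ms["within_members"]


error_a = error_for_group_type(ms)
error_c = error_for_groups(ms)
f_type = ms["group_type"] / error_a
f_groups = ms["groups_in_types"] / error_c
print(round(error_a, 2), round(f_type, 2))
print(round(error_c, 2), round(f_groups, 2))
```

The resulting error terms (4.92 and 2.23) and ratios (0.14 and 2.18) agree with the corresponding entries printed in Table 1 for the liking ratings.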

RESULTS 


The results of the analyses of variance of
the four measures of attraction to the group
are presented in Table 1. Since the principal
purpose of this study was to test the hypothe-
sis that similarity of personality leads to at-
traction, the lack of significant F for group
types on any of the four measures of attrac-
tion requires rejection of this hypothesis.
Table 1 shows also that the Sociometric X
Type interaction was not a significant source
of variance on any of the four measures of
attraction either. There was no tendency for
the members of homogeneous groups to be-
come more attracted to their groups than
were members of heterogeneous groups at
any of the three points of measurement.

3 Profound thanks are due A. Bruce Clarke of the
Mathematics Department, University of Michigan,
for developing this index. For a more complete
description of this index and its properties, see Hoff-
man (1962).

4 We wish to thank Charles Proctor of the Uni-
versity of North Carolina and Paul Dwyer, Warren
Norman, and Esther Schaeffer of the University of
Michigan for invaluable assistance in the design of
this analysis.

TABLE 1
ANALYSES OF VARIANCE OF FOUR MEASURES OF ATTRACTION TO THE GROUP

                                          Ratings                              Choices
                                   Liking        Problem solving        Liking        Problem solving
Source                        df   MS      F        MS      F           MS      F        MS      F
Group type (A)                 1   0.70    0.14     0.00    0.02        4358    1.75     5761    1.95
  Error a                          4.92  (df=27)    3.42  (df=22)       2487  (df=10)    2944  (df=25)
Sociometrics b (B)             2   4.36   10.14*    1.13    2.46         857    1.84     1329    3.34*
B X A                          2   0.48    1.12     0.12    0.26        1032    2.21      297    0.75
Between groups in types (C)   31   4.87    2.18**   3.76    1.98**      1921    1.09     3045    1.98
  Error c                          2.23  (df=99)    1.89  (df=97)       1750  (df=102)   1534  (df=103)
B X C                         62   0.43    1.00     0.46    1.02         466    1.30      398    1.40*
Between members in groups
  and types                   99   2.23    5.19     1.88    4.18**      1642    4.58**   1429    5.04**
Within members in groups
  and types d                198   0.43     -       0.45     -           358     -        283     -

a Error estimated from algebraic sum of Between Groups + Sociometrics X Type - Sociometrics X Groups in Type (cf. Winer, 1962, pp. 199-202).
b Tested by Sociometrics X Groups in Type with df = 2/31.
c Error estimated from algebraic sum of Between Members + Sociometrics X Groups - Within Members (cf. Winer, 1962, pp. 199-202).
d Used as error to test Sociometrics X Group and Between Members sources.
* p < .05.
** p < .01.

Since neither of these two sources of
variance (between types or Sociometric X
Type interaction) approached statistical sig-
nificance, these results confirm the earlier
findings that in these types of problem-
solving groups, people with similar personali-
ties, as measured by the GZTS, were no more
attracted to each other than were people with
dissimilar personalities. The results were the
same whether measured by ratings or choices
or on the basis of liking or of problem-solving
ability.

The means and standard deviations for the
two types of groups on each measure are
shown in Table 2. Only on the choice mea-
sures is there any suggestion of greater at-
traction in the homogeneous groups, but the
magnitudes of the differences are neither large
nor maintained consistently.

While the data tend to reject the similarity- 
attraction hypothesis, three additional analy- 
ses were performed as checks on these find- 
ings. The first asked whether the members of 
homogeneous groups, when they chose out- 
side their own groups, chose another person 
who was also similar to them, that is, substi- 
tuted another person who was equally similar 
to them. For each member of a homogeneous
group, his similarity (the tau correlation of
his personality profile) to his choices outside
the group was compared with his similarity
to the unchosen members of his own group. 

These means and standard deviations are 
reported in Table 3. It is clear that the homo- 
geneous group members did not choose out- 
side their groups on the basis of their simi- 
larity to their choices. On the contrary, their 
mean taus with the members of their groups 
whom they did not choose were substantially 
higher than were their mean taus with their 
outgroup choices. Their outgroup choices
violated the similarity-attraction hypothesis, 
since the subjects were less similar in per- 
sonality to their choices than to their 
unchosen fellow group members. 

A second check of the hypothesis was made
by comparing the taus of the choices of
heterogeneous group members to the mean
taus of a random sample of pairs of sub-
jects. This analysis tested whether the mem-
bers of heterogeneous groups chose people
who were more similar to themselves than the
average pair of subjects in the class were to
each other.

TABLE 2
MEANS AND STANDARD DEVIATIONS OF MEASURES OF ATTRACTION IN HOMOGENEOUS
AND HETEROGENEOUS GROUPS

                        Ratings                         Choices
                  Liking      Problem solving     Liking        Problem solving
                  M    SD       M    SD           M     SD        M     SD
Sociometric 1
  Homogeneous    6.7  1.29     6.7  0.93         62.2  29.62     59.0  26.52
  Heterogeneous  6.8  1.11     6.7  1.17         61.8  25.31     49.4  30.20
  Total          6.8  1.19     6.7  1.07         62.0  27.22     53.7  28.88
Sociometric 2
  Homogeneous    7.2  1.09     6.8  1.15         72.8  25.82     65.0  25.83
  Heterogeneous  7.0  1.12     6.8  0.90         61.6  28.99     55.9  26.05
  Total          7.1  1.10     6.8  1.02         66.5  28.04     60.0  26.25
Sociometric 3
  Homogeneous    7.2  1.05     6.8  1.05         67.4  29.83     59.6  31.56
  Heterogeneous  7.0  1.02     $3   1.00         58.7  32.12     55.5  30.06
  Total          7.1  1.03     6.9  1.02         62.5  31.28     57.3  30.70

The results of this analysis are reported in
Table 4. In every case the mean tau with
the choices was less than the mean random
tau, and in one case significantly so. Again,
the evidence is contrary to the similarity-
attraction hypothesis.

TABLE 3
COMPARISONS OF TAUS OF OUTGROUP CHOICES WITH
TAUS OF UNCHOSEN GROUP MEMBERS FOR
MEMBERS OF HOMOGENEOUS GROUPS

                          Outgroup choices     Unchosen members
                   N a     M tau     SD         M tau     SD        t b
Liking
  Sociometric 1    49      .069     .225        .532     .124      12.19
  Sociometric 2    42      .127     .240        .556     .126       9.83
  Sociometric 3    45      .076     .266        .548     .135      11.16
Problem solving
  Sociometric 1    50      .009     .254        .555     .143      13.38
  Sociometric 2    45      .028     .218        .549     .125      14.23
  Sociometric 3    49      .015     .259        .548     .141      12.20

a Ns represent the number of people who made outgroup
choices. If a subject made more than one choice, the mean of
his taus with the other subjects was calculated.
b All t values for differences between taus of outgroup choices
and unchosen group members are significant at p < .001.

One final check on the adequacy of this 
experimental test of the similarity-attraction
hypothesis was performed. While only 25% 
of the class were female, only one control was 
applied to prevent sex from being a contami- 
nating factor in the original assignment of 
subjects to groups. No woman was assigned 
to a heterogeneous group if the major variable 
in her being different from the male members 
was her Masculinity-Femininity score on 
the GZTS. To check the possibility that 
the similarity-attraction hypothesis might be 
valid for like-sex choices, but not for cross- 
sex choices, the mean tau was computed for 
mutual choices only on each sociometric (i.e.,
where two people chose each other). These
means are shown in Table 5 and are com-
pared with the mean tau among the random
sample of pairs.

TABLE 4
COMPARISON OF MEAN TAUS OF CHOICES BY MEMBERS
OF HETEROGENEOUS GROUPS WITH
RANDOM TAUS

                     Liking choice             Problem-solving choice
                 N a    M tau      t           N      M tau      t
Random tau       231    .067
Sociometric 1    282    .016     -1.89        278     .022     -1.67
Sociometric 2    270   -.002     -2.56*       267     .017     -1.85
Sociometric 3    279    .035     -1.18        277     .025     -1.68

a N represents the number of choices made, which was three
for each subject.
* p < .05.

TABLE 5
SIMILARITY (TAUS) OF MUTUAL-CHOICE PAIRS

                                              Type of mutual choice
                                        Male-male                Male-female
                                     M tau   N a    t b        M tau   N a    t b
Mutual liking choices on:
  Sociometric 1                      .149    84    2.22        .118    41    1.02
  Sociometric 2                      .168    93    2.94        .152    46    1.70*
  Sociometric 3                      .213    88    4.25**      .149    43    1.58
Mutual problem-solving choices on:
  Sociometric 1                      Sfi     67    1.97*       .154    21    1.11
  Sociometric 2                      .191    87    3.19**      .182    26    1.86*
  Sociometric 3                      .183    73    2.99**      .140    32    1.28
Mean of random pairs                 .067                      .067

a N equals number of mutual-choice pairs.
b The t tests are one-tailed comparisons with the mean tau of random pairs.
* p < .05.
** p < .01.

The mean tau for mutual choices was sig-
nificantly greater than the mean tau for
random pairs on all three sociometrics for the
male-male choices and on one for the male-
female choices. This finding may be ac-
counted for almost entirely, however, by
members of homogeneous groups who chose
members of their own groups. Since the
homogeneous groups were composed origi-
nally of people with high taus between their
profiles, such choices increase the mean sub-
stantially. When the ingroup choices of mem-
bers of homogeneous groups are eliminated
from this analysis, the mean tau of the
remaining pairs of mutual choices is reduced
to being insignificantly different from the
mean tau of random pairs. (Fortunately for
this subanalysis and consistent with the
results presented earlier, the proportions
of mutual choices made by members of
homogeneous and heterogeneous groups were
hardly different.) Thus, this one piece of evi-
dence in support of the similarity-attraction
hypothesis is probably due to an artifact of
the general tendency of subjects to choose
within their own groups.

Going back for a moment to Table 1, we
find that the source which consistently con-
tributed significant variance was the vari-
able between-individuals-within-groups (and
types) on all four measures. Despite their
common experiences and extended interaction, 
the members of each group reacted in idio- 
syncratic ways. While this effect might be 
attributed to response set on the rating mea- 
sures, no such easy explanation can account 
for the results on the choice measures. It 
would appear that each subject used his own 
criteria for rating the other group members 
and in choosing potential members for his 
group. This result held even within the homo- 
geneous groups where one might have ex- 
pected the members to have been more simi- 
lar to each other in their reactions to the 
group experience than in the heterogeneous 
groups. In fact, comparisons of the vari- 
ous estimates of the variance components 
(between groups, Sociometrics X Groups, be- 
tween members, within members) for the 
homogeneous and heterogeneous groups sepa- 
rately, yielded no consistent evidence of dif- 
ferences in results attributable to the variable 
of group type. Neither the between mem- 
bers nor the Sociometric X Groups interaction 
components showed more variation among 
heterogeneous than homogeneous groups, as 
one might have expected a priori. 

One other point should be made with re-
spect to the changes in attraction over time.
While significant between sociometric effects
were found for the liking ratings and the
problem-solving choices, the data in Table 2
indicate no continuous increase in attraction
over time. The significance of these two
measures resulted from a sharp increase from
Sociometric 1 to Sociometric 2 (from the
fifth to the eighth weeks) followed by a level-
ing off or a slight decline in attraction at
Sociometric 3.

TABLE 6
CORRELATIONS AMONG MEASURES OF INDIVIDUAL
ATTRACTION TO THE GROUP
ACROSS SOCIOMETRICS

Correlations                    Ratings                  Choices
between                               Problem                  Problem
sociometrics       N a     Liking     solving      Liking      solving
1 and 2
  Homogeneous      57       .45        .69          .51         .59
  Heterogeneous    68       .60        .48          .70         .69
1 and 3
  Homogeneous      58       4          .68          .24         .52
  Heterogeneous    69       .54        .43          .52         .48
2 and 3
  Homogeneous      58       .78        .83          .32         .64
  Heterogeneous    69       .59        .49          .63         .68

Note. All correlations are significantly different from 0
at the .01 level, except the correlations of liking choice on
Sociometrics 2 and 3 for the members of homogeneous groups.
a Ns vary due to missing data on a few subjects on each
sociometric.

Interestingly, despite the changes noted in 
the absolute level of group attraction, the 
relative level of group attraction appears to 
have been established by the time of the first 
sociometric and remained fairly constant for 
the remainder of the semester. Correlations of 
individual attraction measures among the 
three sociometrics are presented in Table 6. 
Except for the liking choice index, all correla-
tions are moderately high and statistically
significant. The relative attraction of mem-
bers for their different groups appears to have
been established fairly well by the fifth week. 
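
The distinction drawn here between the absolute and the relative level of attraction can be made concrete with a correlation computed across two sociometrics. The sketch below assumes a product-moment correlation, which the text does not specify, and uses invented ratings; the function name `pearson_r` is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical liking ratings for six subjects (not data from the study).
week5 = [6.0, 7.0, 5.0, 8.0, 6.5, 7.5]
# Every subject rises by the same amount, so the ordering is unchanged.
week8 = [rating + 0.5 for rating in week5]

print(round(pearson_r(week5, week8), 3))
```

A uniform rise in ratings changes the mean but leaves the correlation at its maximum, which is why a shift in absolute level between sociometrics is compatible with stable relative standing.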


DISCUSSION


The results comparing the relative attrac- 
tion to their groups of members of homogene- 
ous and heterogeneous groups clearly confirm 
Hoffman’s (1958) earlier finding of no differ- 
ence. On none of the four measures taken were 
the homogeneous groups significantly more at- 
tractive than the heterogeneous groups, nor 
was there any indication of a trend in this di-
rection on successive measurements (cf. lack of
Sociometric X Group Type interaction). Fur-
thermore, the three checks of alternative
manifestations of the similarity-attraction
phenomenon offered further disconfirmation
of its existence. In this setting, there is no
evidence that people with similar personali-
ties become more attracted to each other than
do people with dissimilar personalities.

Alternatively, the results indicate that the
superior problem-solving effectiveness of the
heterogeneous groups (Hoffman & Maier,
1961) was not achieved at a cost of the
members’ attraction for each other. Nor, on
the other hand, did the objective failure of
the homogeneous groups reduce their attrac-
tion for their members. These points might
suggest, then, that the success of the hetero-
geneous groups compensated for the lack of
similarity of their members in making the
groups attractive. The homogeneous group
members, therefore, should have been more
attracted to their groups on the basis of
liking than problem-solving ability, while the
heterogeneous group members should have
been more attracted for problem-solving abil-
ity than for liking. Unfortunately for this
explanation, the means in Table 2 show no
such reversals. While objectively the two
types of groups differed in effectiveness, their
subjective appraisals of their groups’ ability
were not different. Diversity of group mem-
bership, conducive to effective problem
solving, need not lead to disharmony, nor is
uniformity a guarantee of good feeling.

The results of Hoffman’s earlier study
failed to shake the confidence of certain re-
viewers in the proposition that similarity pro-
duces attraction (e.g., Hare, 1962, p. 141).
True, Hare modifies the general hypothesis
to suggest only that “friends [may be] simi-
lar on some but not all of their personality
characteristics.” We had no way of testing
this proposition with the present data, since
the “modal personalities” of the different
homogeneous groups were quite varied. If the
problem-solving effectiveness of the two types
of groups had not differed so consistently,
we might have been persuaded that the
method of determining personality similarity
was not valid (an argument, of course, that
can still be made). However, the present
results, having replicated the earlier findings,
suggest to us that situational factors proba-
bly place limits on, and occasionally com-
pletely overwhelm, the significance of person-
ality similarity in determining interpersonal
attraction.

Consideration of the differences between
the present study and those in which similar-
ity and attraction were positively associated
suggests at least two important situational
factors. In the first place, the present groups
were formed on the basis of similarity and
were required to remain together and to inter-
act intensively in the classroom. In most of
the other studies, the groups were formed on
other bases and the members were more able
to regulate the extent of their relations with
others or to associate more with certain
members than with others (e.g., Rosenfeld &
Jackson, 1959; Shapiro, 1953). In unstruc-
tured social situations people with similar
personalities may become attracted to each
other as they find their attitudes, interests,
and reactions to shared experiences to be
compatible. The attractiveness of a group
whose membership is prescribed by the or-
ganization may, on the other hand, be a
function of factors more directly related to
the goals of the group. For example, Metzner
and Mann (1953) found a negative correla-
tion between the percentage of workers who
felt their groups were good at “sticking
together to get what they wanted” and the
groups’ absence rate, a behavioral manifesta-
tion of low attraction.

The possible relevance of the group’s pur-
pose for the members’ attraction points up a
second way in which the present study dif-
fers from the others. Most of the positive
correlations between similarity and attrac-
tion have been found in groups which were
principally social in nature, such as sorority
members (Shapiro, 1953), rooming-house resi-
dents (Newcomb, 1961), or work associates
who socialized together outside but were not
necessarily members of the same work group
(Rosenfeld & Jackson, 1959). In all of these
situations the principal demand placed on the
relationship is one merely of getting along
with one another and, if possible, enjoying
the relationship. Under such circumstances,
similarity of personality could provide the
basic ingredients for enjoying the same kinds
of activities, sharing positive feelings about
pleasant events, and even offering sympathy
when things go wrong.

The groups in the present study, on the 
other hand, were faced weekly with the task 
of interacting to provide themselves a learn- 
ing experience, which usually involved the 
resolution of a conflict built into a role- 
playing situation. More important probably 
than their ability to get along and surely to 
like each other, was the challenge of meeting 
each situation successfully. While no direct 
feedback was given to the groups regarding 
their success on each problem, the class dis- 
cussion following each problem gave them 
some insight into the adequacy of their per- 
formance. Perhaps the members were at- 
tracted to their groups as a function of their 
perceptions of the group's success. 

The study was not designed to test this 
proposition, since the groups’ problem-solving 
performance and members’ satisfaction with 
the solutions were measured on only a sample 
of problems. Furthermore, the satisfaction 
ratings were taken before the class discussion 
which may have changed the members’ per- 
ceptions of their groups’ success. Some modest 
positive correlations between the members’ 
satisfaction with certain of their groups’ 
solutions and their attraction to their groups 
suggest that perceived success or failure may 
be one determinant of attraction in problem- 
solving groups. Unfortunately, data were not 
collected systematically, after the discussion 
of the problem in each laboratory section, 
on the members’ perceptions of their groups’ 
success or failure. It might be hypothesized,
however, that the variables determining group
attraction will be those most closely associ-
ated with the members’ perceived function for
the group: congeniality in the case of social
groups, task effectiveness for work groups.
Such a definition may help to specify New- 
comb’s (1961) contention that interpersonal 
attraction is a function of similarity of atti- 
tudes concerning objects of high importance. 
Objects become important for group members 
if they are relevant to their definition of the 
group’s task. 

Another important consideration raised by 
the results of the present study is the effect
on attraction of continued interaction. Slater
(1955) has suggested that the attraction of 
members for their groups continues to in- 
crease with continued interaction. In his 
study the groups met for only four meetings, 
once a week over a 4-week period. The pres- 
ent study suggests that while attraction may 
increase for a certain period, it reaches a 
ceiling and may even decline if the group 
continues to meet. In light of the hypothe- 
sized relationship between group success and 
attraction, it is conceivable that attraction 
may even decline if further meetings fail to 
accomplish the group's objectives. Certainly, 
experience tells us that certain groups dis- 
integrate when internal or external stresses 
become unmanageable. On the other hand, 
interaction and continued success would seem 
to be conducive to increasing group attraction.

Physical aggression was studied in relation to 5 variables. There were 2
intensities of frustration plus a control group. The aggression (delivery of
electric shock) was either of instrumental value in overcoming the frustration
or of no instrumental value. The victim either gave feedback (moans, groans)
or did not. Ss were either men or women; the victims were either men or
women. Frustration did not lead to more aggression than a control. When
aggression was of instrumental value, it was more intense than when it was
of no value. Feedback lowered aggression intensity. Men aggressed more
intensely than women, and male victims received more intense aggression than
women. Thus frustration was the only variable that did not affect aggression,
a fact which bears on both the definition of frustration and the
frustration-aggression relationship.


There have been hundreds of studies of
aggression (see Berkowitz, 1962; Buss, 1961)
but only a few in which a subject aggresses
physically against another person. These are
experiments by Berkowitz and his students
(Berkowitz, 1962), Hokanson and colleagues
(e.g., Hokanson, Burgess, & Cohen, 1963),
Milgram (1963), Walters and his students
(e.g., Walters & Thomas, 1963), and the
author (e.g., Buss, 1963).

Investigations of the relationship between
pure frustration and actual aggression (as
opposed to projective or inventory measures)
are even more rare. Of course this statement
depends upon one's definition of frustration.
Berkowitz (1962) adopts an inclusive
position, defining frustration as:


the interruption of an internal response sequence or 
the blocking of some drive. Similarly, a person who 
steps on our toes might also arouse anger if this 
action interrupted or interfered with the internal 
response oriented toward the preservation or
attainment of security and comfort [p. 30].


This definition is extremely general, for it is
difficult to imagine a behavior sequence in
which some drive is not suffering interference,
especially when security and comfort are
included as drives. In this context frustration
is ever-present and is therefore an antecedent
not only of aggression but of virtually all
behavior. Thus the inclusive definition appears
to be so broad as to lose all rigor, a risk that
Berkowitz himself has admitted (p. 30).


¹ The writer thanks Edith Buss, Robert Levine,
Peggy Martin, Fern Portnoy, Arnold Rintzler,
Milton Rolle, Wade Silverman, and Rae Yanosky for
participating in the experiment, and Merle
Moskowitz for help with the manuscript. The research was
supported by Grant MH-0577-01A1 from the United
States Public Health Service.

The author’s definition of frustration is nar- 
rower: the blocking of ongoing instrumental 
behavior that has in the past led to a rein- 
forcer. This conception is sufficiently broad to 
encompass a variety of interfering stimuli 
without including virtually all the stimulus 
situations that confront the organism. 

The difference in approaches to frustration 
is not merely academic; it has research
implications. Berkowitz subsumes attack under
the heading of frustration, so that a blocking 
operation and insult by the experimenter are 
both labeled frustration. The author, on the 
other hand, distinguishes between attack and 
frustration. Attack is assumed to be a potent 
antecedent of aggression, whereas frustration 
is a weak antecedent of aggression. Thus 
when an experimenter blocks ongoing be- 
havior and insults the subject, it is the verbal 
attack that presumably elicits intense aggres- 
sion, not the frustration. The writer contends 
that the inclusive definition of frustration has 
blurred important differences between attack 
(delivery of noxious stimuli) and pure frus- 
tration (blocking of ongoing behavior). Verbal
attack usually elicits anger and, under
certain conditions, aggression from a subject, 
whereas frustration is unlikely to do so. This 
success in eliciting aggression may account 
for the extensive use of verbal attack in ex- 
periments labeled frustration, but such use 
should not mislead anyone into lumping at- 
tack with frustration. 

Thus there appear to be two reasons for the 
scarcity of experiments relating pure frus- 
tration to physical aggression: frustration has 
been defined so as to include attack, and this 
has led to the confounding of attack with 
frustration in experimental manipulations; 
and only a few researchers have attempted to 
study physical aggression in the laboratory. 

The position taken here is that frustration
is not a potent determiner of aggression and 
leads to only mild aggression. This hypothesis 
was supported in a previous study (Buss, 
1963), which employed three different frus- 
trations and a control condition. Differences 
in the strength of frustration did not yield 
significant differences in the intensity of ag- 
gression, and all three frustrations led to only 
minimal (although significant) aggression in 
comparison to the control condition. 

In explaining these findings, it was sug- 
gested that whether or not aggression is in- 
strumentally valuable determines the frus- 
tration-aggression relationship. Aggression was 
noninstrumental in that it did not (and could 
not) overcome the frustration. When aggres- 
sion is of no value in coping with the inter- 
ference, it can serve only to vent the anger of 
the frustrated person. In these circumstances 
frustration should lead to only minimal
aggression. However, when aggression is of value
in coping with the interference (i.e., over- 
comes whatever is blocking the ongoing be- 
havior), it is reinforced not only by venting 
of anger but also by achieving the goal of the 
ongoing instrumental behavior. Thus in the 
previous study the fact that aggression had 
no instrumental value might have minimized 
the aggression-arousing effects of frustration. 

Another variable might also have held down 
the intensity of aggression: feedback from 
the victim. The mode of aggression was the 
delivery of electric shock, and in order to 
maintain realism, the victim gasped, groaned, 
or cried out whenever the shock level exceeded
a given intensity. This feedback from the
victim presumably had the effect of lowering 
aggression intensity (this has been confirmed 
by subsequent unpublished research). 

In brief, there are two variables that might 
have minimized the aggression-arousing prop- 
erty of pure frustration: the lack of instru- 
mental value of the aggression and feedback 
from the victim. The present study attempted
to investigate these variables by repeating
the above two conditions and adding two
others: instrumentally valuable aggression
and no feedback from the victim. Frustration
should lead to the most intense aggression
when the aggression has instrumental value
and/or there is no feedback from the victim.
Frustration should lead to the least intense
aggression when aggression has no instrumental
value and/or there is feedback from
the victim.


METHOD


Measurement of Aggression 


Since the apparatus has been described in detail
elsewhere (Buss, 1961, 1963), the present account
will be brief. The subject, together with another
“subject” (actually a confederate, hereafter called
the victim), is told that the two of them will
participate together in a learning experiment. A rigged
lottery procedure assures that the real subject will
be the “experimenter” and the victim, the “subject.”

The real subject is then taken to the experimental
room and shown how to present stimuli and record
responses. He is instructed to flash a “correct” light
after correct responses and to shock the victim
after incorrect responses. There are 10 shock
buttons, and the subject is given shock from Buttons
1, 2, 3, and 5 in order to know how much punishment
he will subsequently deliver. Button 1 gives a
very mild shock, which increases in intensity with
each succeeding button until at Button 5 it is
noxious. The subject is told that the intensity
continues to mount throughout Buttons 6-10, and by
such extrapolation Button 10 would be excruciatingly
painful.

Then the victim is brought in and left with the 
subject, who places the electrode on the victim's 
finger. The victim reads the brief instructions and 
then the stimuli are presented. The subject had
been told to shock after each of the first 10 trials,
this being “a Confusion Series designed to wipe out
any pre-experimental response tendencies.” The
confusion series yields a base level of aggression
(shock) intensity.

Once the victim and the subject are seated, their
vision of each other is blocked by the apparatus.
The victim opens his “college notebook,” which
contains a data sheet for recording shock levels and
a programed series of responses that is the same for
all subjects: six errors on Trials 11-20, five errors on
Trials 21-30, and four errors per 10 trials thereafter
until he meets the learning criterion of five consecutive
correct responses. The victim silently opens a
side section of the apparatus, simultaneously shutting
off the shock and exposing a “Nixie tube” that
reveals which shock button is being used.


Frustration Procedure 


The subject's goal is to teach the victim a simple
concept, and the frustration consisted of the
victim's inability to learn. One way of varying the
strength of frustration is to vary the strength of
the motivation being blocked (Dollard, Doob,
Miller, Mowrer, & Sears, 1939). Accordingly, the
incentive for the subject was either proving his own
ability to obtain learning (know-how) or perhaps
obtaining a better course grade (grades). Thus there
were two frustration conditions plus a control.

Know-how. The subject was told that previous
research showed that the know-how and general
ability of the “experimenter” was a major de- 
terminer of learning and that more capable experi- 
menters were getting faster learning. He was told 
that the victim should reach the learning criterion by 
approximately Trial 30, which was when most 
victims reached criterion. 

Grades. The subject was told that the introductory
psychology professor was interested in the results,
which would be turned over to him. He would
use them to help determine borderline grades. The 
subjects were told of this deception upon completing 
the experiment, and no subject was dismissed until 
he was at ease concerning his grades. Otherwise the 
instructions were the same as for the know-how 
group. 

Control. The subject was told nothing about the 
determiners of learning, and, unlike the frustration 
subjects, he was given no indication of when the 
victim might learn. In both the control and the two 
frustration conditions the victim reached criterion 
starting at Trial 70.²


Instrumental Value of Aggression 


One method of making aggression instrumentally
valuable would be to have the subject learn that
whenever he gives more intense shock, the victim
learns faster. The victim would differentially
reinforce more intense aggression by making correct
responses only if the shock level were elevated and
by making incorrect responses if the shock level
remained low. Unfortunately, the subject would be
learning that aggression is of value at the same time
as he is being frustrated, thus confounding the two
variables.³


² This brief exposition of the frustration procedure
may be supplemented by the more detailed
account of the previous study (Buss, 1963). The
money-frustration group, employed previously, was
not used because it appeared to be of approximately
the same intensity of frustration as the know-how
group and because the incentive value of the small
monetary reward might covary (negatively) with
personal wealth.

The only way to avoid such confounding is to 
establish the instrumental value of aggression at the 
start of the experiment. This was accomplished by 
instructions. The subject was told that research had 
established that the more intense the shocks, the 
faster the learning, that is, that aggression would be 
instrumentally valuable in overcoming frustration 
(the victim’s inability to learn). This instruction was 
given to half the subjects, for whom aggression 
thereby would offer a means of overcoming the 
blocking of the path to their goal. It was not given
to the other half of the subjects, for whom aggression had no
specific instrumental value. 


Feedback 

Half the subjects were given feedback. Whenever
the shock level reached Buttons 8, 9, or 10, the victim
gasped, groaned, or cried out in pain. In order
to maintain realism, this feedback was given only 
half the time when Buttons 6 or 7 were used, and 
there was no feedback below Button 6. No feedback 
was given during the confusion series. The other half 
of the subjects were given no feedback at all; the 
victim remained completely silent. 


Subjects and Design 


The subjects were recruited from introductory psy- 
chology classes. Half were men and half were 
women; half had a male victim, and half had a 
female victim. 

In terms of the experimental design, there were 
five independent variables: frustration, instrumental 
value, feedback, sex of subject, and sex of victim. 
This yielded a 3 × 2 × 2 × 2 × 2 design, with five
subjects per cell, making a total of 240 subjects. 
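
The cell and subject counts stated above can be checked by enumerating the factorial design. The following is a minimal sketch; the level labels are taken from the text, but the variable names are illustrative assumptions and not part of the original report.

```python
from itertools import product

# Levels of the five independent variables as described in the text.
# The dictionary keys and level labels are illustrative names only.
factors = {
    "frustration": ["control", "know-how", "grades"],
    "instrumental_value": ["learning hypothesis", "no learning hypothesis"],
    "feedback": ["feedback", "no feedback"],
    "sex_of_subject": ["male", "female"],
    "sex_of_victim": ["male", "female"],
}

SUBJECTS_PER_CELL = 5  # stated in the text

# Every combination of one level per factor is one cell of the design.
cells = list(product(*factors.values()))
n_cells = len(cells)
total_subjects = n_cells * SUBJECTS_PER_CELL

print(n_cells, total_subjects)
```

Under these assumptions the enumeration gives 48 cells, which at five subjects per cell agrees with the reported total of 240.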


RESULTS 


The design of this experiment was com- 
plex and the data plentiful; the following 
steps were taken in order to keep the exposi- 
tion orderly. The confusion-series data are 
presented separately from the learning-series 
data. With the exception of the frustration 
condition, the only results presented in graphic 
or tabular form are those that achieved sig- 
nificance in the analyses of variance. The con- 
ditions in which aggression was either instru- 
mentally valuable or not instrumentally valu- 
able are called learning hypothesis and no 
learning hypothesis, respectively. 


³ The use of the victim's correct responses as
reinforcement for aggression is itself of some interest,
and it is being investigated in the absence of
frustration.


Confusion Series


The analysis of variance of the confusion- 
series data is shown in Table 1. The feed- 
back variable was omitted from the analysis 
because no feedback was given during the con- 
fusion series. The learning hypothesis F is 
highly significant, and inspection of the ap- 
propriate means reveals that the learning- 
hypothesis group gave reliably more intense 
shock than the no-learning-hypothesis group. 
This is of course the expected result, and it 
merely confirms the success of the instruc- 
tions. Telling subjects that learning is facili- 
tated by more intense shock should lead to a 
higher shock level, as well as render the ag- 
gression instrumental in overcoming frustra- 
tion. 

The Frustration × Learning Hypothesis
interaction is significant, and Table 2 presents
the appropriate means. It can be seen that
when aggression has no instrumental value (no
learning hypothesis), there are essentially no
differences between the control, know-how,
and grades groups. These three groups
presumably differ in their motivation to teach the
victim. The control subjects no doubt wanted
the victim to learn because that was the
ostensible goal of the experiment. The know-how
subjects had the additional motivation
of proving their own “general ability” by having
the victim achieve fast learning. The
grades subjects had the additional incentive
of perhaps bettering their grade (and also the
risk of worsening it). Thus there were a priori
reasons for expecting variations in motivation
among the three groups.


TABLE 1

ANALYSIS OF VARIANCE OF CONFUSION SERIES DATA

[Table values are not recoverable from the source. Legible sources:
Frustration (A), Learning hypothesis (B), Sex of subject (C),
Sex of victim (D).]


TABLE 2

MEAN SHOCK LEVEL DURING THE CONFUSION SERIES
OF THE FRUSTRATION GROUPS WITH AND
WITHOUT THE LEARNING HYPOTHESIS

              No learning    Learning
              hypothesis     hypothesis
Control           3.0           3.1
Know-how          3.0           3.5
Grades            2.7           4.2

With aggression of no instrumental value 
(no learning hypothesis), these differences in 
motivation are not reflected in differences in
aggression intensity, and the three groups are 
similar (see Table 2). However, when ag- 
gression has instrumental value in achieving 
the goal, the differences in motivation are 
clearly reflected in aggression intensity. 
Grades subjects aggressed more than know- 
how subjects, who aggressed more than con- 
trol subjects. These differences (which are 
reliable, as indicated by the significant
interaction F) confirm the a priori ordering of the
three groups in terms of motivation. 

Thus when aggression is instrumentally 
valuable, the motivational differences between 
the three groups become manifest. These 
differences establish the success of the frus- 
tration manipulations: introducing motiva- 
tions of varying strength, which subsequently 
suffer interference (the victim does not learn). 
The stronger the motivation being blocked,
the greater the frustration, and ostensibly the
greater the frustration, the more intense the
aggression.

The only other significant Fs in Table 1
are for sex of subject and sex of victim. Men 
aggress more intensely than women, and male 
victims receive more intense aggression than 
female victims. These findings repeat the sex 
differences of the previous study (Buss, 1963), 
and they are consistent with other research 
and observations made outside the labora- 
tory. 


Learning Series


Frustration. One of the major goals of the
experiment was to determine the intensity of
aggression elicited by frustration. The first
question to be answered is: What was the
overall effect of frustration regardless of the
other independent variables? The relevant
data are presented in Figure 1. In this graph
and all subsequent ones the first two blocks of
shock trials represent the confusion series,
and Blocks 3-7 represent the learning series.
Note that the shock trials are not the same
as the learning trials since shock was delivered
only after incorrect responses. The know-how
and grades groups had been told that
learning should occur by Trial 30, which
corresponds approximately to Shock Trial 20.
Any effects of frustration should become
evident during the fourth block of shock trials,
when it became apparent to the subject that
the victim would neither learn faster than the
norm nor even as well as the norm.
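
The correspondence between learning trials and shock trials follows from the victim's programed error schedule described under Method. The sketch below is a hedged illustration: the function name is hypothetical, and it counts shocks only at the end of each 10-trial block because the position of errors within a block is not given in the text.

```python
# Error schedule taken from the Method section:
#   Trials 1-10  : confusion series, shock after every trial (10 shocks)
#   Trials 11-20 : six errors
#   Trials 21-30 : five errors
#   thereafter   : four errors per 10 trials (until criterion)
CONFUSION_SHOCKS = 10
ERRORS_BY_BLOCK_END = {20: 6, 30: 5}
LATER_ERRORS_PER_BLOCK = 4


def shocks_through(trial):
    """Cumulative number of shocks delivered by the end of `trial`.

    `trial` must be a multiple of 10, since only per-block error
    counts are known. Hypothetical helper, not from the original.
    """
    if trial < 10 or trial % 10 != 0:
        raise ValueError("trial must be a positive multiple of 10")
    total = CONFUSION_SHOCKS
    for block_end in range(20, trial + 1, 10):
        total += ERRORS_BY_BLOCK_END.get(block_end, LATER_ERRORS_PER_BLOCK)
    return total


for t in (10, 20, 30):
    print(t, shocks_through(t))
```

On this schedule, learning Trial 30 falls at the 21st shock, which is consistent with the statement that Trial 30 corresponds approximately to Shock Trial 20.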
Examination of Figure 1 reveals that the
three curves start off at slightly different
heights (the effects of varied motivation, as
was shown earlier) and are essentially parallel
throughout. An analysis of variance yielded
nonsignificant Fs for both height (main effect)
and slope (Frustration × Trials) of the
curves. Thus frustration did not lead to
aggression, that is, led to no more aggression
than a control.

Both the instrumental value of aggression
and feedback from the victim were thought to
be determiners of the frustration-aggression
relationship, and the maximal effect of
frustration was expected to occur when
aggression was instrumentally valuable and
feedback was absent. The relevant data are
presented in Figure 2. Here the slopes of the
curves are not quite parallel; the grades
curve is slightly steeper than the other two.
However, an analysis of variance yielded a
nonsignificant Frustration × Trials F of 1.3,
indicating no reliable difference in slopes.
Thus even under the most optimal conditions,
frustration did not lead to aggression.


TABLE 3

ANALYSIS OF VARIANCE OF ALL DATA

Source                          MS      df       F
Feedback (A)                 2446.0      1      7.3**
Learning hypothesis (B)      9653.0      1     28.6***
Sex of subject (C)           3688.0      1     10.9***
Sex of victim (D)            3861.0      1     11.5***
Frustration (E)               620.5      2      1.8
A × B                        1837.0      1      5.5*
A × C                          29.0      1
A × D                          62.0      1
A × E                        1226.0      2      3.6*
B × C                          33.0      1
B × D                         512.0      1      1.5
B × E                         913.0      2      2.7
C × D                           1.0      1
C × E                         204.0      2
D × E                         155.0      2
A × B × C                      39.0      1
A × B × D                     205.0      1
A × B × E                      76.0      2
A × C × D                      17.0      1
A × C × E                     187.0      2
A × D × E                     359.0      2      1.1
B × C × D                      23.0      1
B × C × E                     332.0      2
B × D × E                      19.0      2
C × D × E                      13.0      2
A × B × C × D                1077.0      1      3.2
A × B × C × E                 378.0      2      1.1
A × B × D × E                 665.0      2      2.0
B × C × D × E                 336.0      2
A × C × D × E                  21.0      2
A × B × C × D × E             693.0      2      2.1
Error (between)               337.0    239
Trials (F)                   2814.0      6    163.6***
A × F                         154.2      6      9.0***
B × F                          88.3      6      5.1***
C × F                          23.3      6      1.4
D × F                          33.8      6      2.0
E × F                           8.1     12
A × B × F                      19.3      6      1.1
A × C × F                      25.2      6      1.5
A × D × F                       7.2      6
A × E × F                      17.9     12      1.0
B × C × F                      24.8      6      1.4
B × D × F                       7.0      6
B × E × F                      29.8     12      1.7
C × D × F                       7.0      6
C × E × F                      15.7     12
D × E × F                      17.0     12
A × B × C × F                  14.7      6
A × B × D × F                  13.0      6
A × B × E × F                  22.0     12      1.3
A × C × E × F                  16.3     12
A × C × D × F                   8.0      6
A × D × E × F                  11.0     12
B × C × D × F                   7.0      6
B × C × E × F                  10.3     12
B × D × E × F                  15.6     12
C × D × E × F                  12.3     12
A × B × C × D × F              13.3      6
A × B × C × E × F              28.3     12      1.7
A × B × D × E × F               6.6     12
B × C × D × E × F              36.5     12      2.1
A × C × D × E × F              18.6     12      1.1
A × B × C × D × E × F          50.1     12      2.9
Error (within)                 17.2   1106

* p = .05.
** p = .01.
*** p = .001.

Instrumental value of aggression and feedback
from the victim. It was expected that
making aggression instrumentally valuable
would elevate its intensity. Earlier it was
shown that the learning hypothesis resulted in
more intense aggression during the confusion
series; the relevant data for the learning series
are presented in Figure 3. The learning-hypothesis
curve starts higher and has a
slightly steeper slope than the no-learning-hypothesis
curve. The significance of these
trends was evaluated in an analysis of variance
of the data of the entire experiment,
shown in Table 3. The figures in this table
indicate that the learning-hypothesis curve is
both significantly higher (main effect F of
28.6, p = .001) and steeper (Learning
Hypothesis × Trials F = 5.1, p = .001) than the
no-learning-hypothesis curve. When aggression
is of service in helping the subject to
reach his goal (in this instance, teaching
another person a concept), aggression of higher
intensity tends to be employed.

When the victim cries out in pain, it is
reasonable to assume that the aggressor will
lower the intensity of aggression. The relevant
data are shown in Figure 4. The feedback
and no-feedback curves are close at the
start but diverge after the fourth block of
shock trials. The analysis of variance in Table
3 shows that the no-feedback curve is both
significantly higher (feedback F = 7.3, p =
.01) and steeper (Feedback × Trials F =
9.0, p = .001) than the feedback curve. Thus
the victim’s pain vocalizations tend to reduce
the intensity of the aggression he is receiving.
Sex differences. There were sex differences
in the confusion series, and they carried
through the learning series. The analysis of
variance (Table 3) yielded significant differences
for both sex of subject (F = 10.9, p =
.001) and sex of victim (F = 11.5, p = .001).
Thus men aggressed more intensely than
women, and men were aggressed against more
intensely than women.
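
The F ratios quoted in this section can be checked against the mean squares in Table 3, since each F is the effect mean square divided by the appropriate error mean square. The sketch below assumes, as the table layout suggests, that between-subjects effects are tested against Error (between) and effects involving Trials against Error (within); the variable names are illustrative.

```python
# Mean squares as read from Table 3.
MS_ERROR_BETWEEN = 337.0
MS_ERROR_WITHIN = 17.2

# effect name -> (effect mean square, error mean square used as denominator)
effects = {
    "Feedback (A)": (2446.0, MS_ERROR_BETWEEN),
    "Learning hypothesis (B)": (9653.0, MS_ERROR_BETWEEN),
    "Sex of subject (C)": (3688.0, MS_ERROR_BETWEEN),
    "Sex of victim (D)": (3861.0, MS_ERROR_BETWEEN),
    "Feedback x Trials (A x F)": (154.2, MS_ERROR_WITHIN),
    "Learning hypothesis x Trials (B x F)": (88.3, MS_ERROR_WITHIN),
}

# F = MS(effect) / MS(error), rounded to one decimal as in the text.
f_ratios = {name: round(ms / err, 1) for name, (ms, err) in effects.items()}

for name, f in f_ratios.items():
    print(f"{name}: F = {f}")
```

Under that assumption the ratios reproduce the values given in the text (7.3, 28.6, 10.9, 11.5, 9.0, and 5.1).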


DISCUSSION


It was assumed prior to the experiment
that the instrumental value of aggression and
feedback from the victim are determiners of
the intensity of aggression. The results
unequivocally support this assumption. Aggression
that helps the aggressor to reach his goal
is significantly more intense than aggression
that has no instrumental value, and pain
vocalization from the victim significantly
reduces the intensity of aggression. In the
previous study (Buss, 1963) the aggression
had no instrumental value, and there was
feedback from the victim. These conditions
would reduce the intensity of aggression, thus
explaining why frustration led to only minimal
aggression. This explanation is supported
by the demonstrated effects of instrumental
aggression in intensifying aggression and of
feedback in reducing aggression intensity.

If this interpretation is correct, frustration
should lead to intense aggression when aggression
is instrumentally valuable and there is no
feedback. However, this did not occur;
frustration under these conditions led to no more
aggression than a control (see Figure 3 and
Table 3). In fact, in the present study
frustration simply did not lead to aggression.
Since this finding runs counter to the beliefs
and theories of many psychologists (e.g.,
Berkowitz, 1962; Dollard et al., 1939), it is 
necessary to examine the experimental opera- 
tions to determine whether they might ac- 
count for the results. There are two main pos- 
sibilities: the frustration procedure and the 
measurement of aggression. 

The frustration procedure was in accord 
with the definition of the term: a motivation 
was established for the subject to reach a goal, 
and then his efforts to reach the goal were 
blocked. The goal was to teach a concept and 
obtain learning as fast or faster than a ficti- 
tious norm. Reaching this goal would os- 
tensibly be motivated by self-esteem (know- 
how) or the desire for a good course grade 
(grades). If the subject were not motivated, 
there could be no frustration. Therefore it is
necessary to establish the success of the moti- 
vating instructions. 

In assessing the subject's motivation, ag- 
gression intensity prior to frustration offers 
an excellent measure. If aggression is instru- 
mental in helping the subject to achieve his 
goal, then more motivated subjects should 
aggress more intensely than less motivated 
subjects. This is precisely what occurred in 
the confusion series (prior to frustration). 
When aggression was instrumentally valuable 
(learning hypothesis), grades subjects ag- 
gressed most intensely, know-how subjects 
next, and control subjects least intensely. 
These results indicate the success of the moti- 
vating instructions, establishing this increas- 
ing order of motivational strength: control, 
know-how, grades. 

The second aspect of the frustration pro- 
cedure was to block the subject’s efforts to 
reach the goal (successfully teaching a con- 
cept). The blocking consisted of the victim’s 
inability to learn, and there is no need to
establish the success of this operation. It suffices
merely to denote its occurrence: the vic-
tim, by not meeting the learning criterion, 
prevented the subject from achieving the goal 
of successful teaching of a concept. 

It might be argued that the control group 
should not be considered a no-frustration 
group because there really was frustration, 
the reasoning being as follows. The control 
subjects had as their goal the teaching of a 
concept. This motivation was blocked by the 
victim's failure to learn, and this frustration 
led to increments in aggression during the 
learning series. 

This argument can be refuted both em- 
pirically and logically. The motivation of con- 
trol subjects to teach a concept was slight, as 
shown by the data in Table 2. Control sub- 
jects given the learning hypothesis had essen- 
tially the same level of aggression intensity 
during the confusion series as control subjects 
not given the learning hypothesis. However, 
the know-how and grades subjects, who had 
been given clear motivation to obtain learn- 
ing, showed increments in aggression inten- 
sity when they were given the learning hy- 
pothesis (see Table 2). 

The motivation of the control subjects to 
obtain learning was not only minimal, but it 
was not directed toward fast learning. They 
were not told how long it would take for the 
victim to learn, and their data sheets allowed 
for 80 trials. Some subjects may have antici- 
pated fast learning and others, slow learning. 
The victim's slowness in learning would frus- 
trate only those subjects who expected very 
fast learning. In the frustration groups all 
subjects expected fast learning because they 
were given fictitious norms, but in the control 
group, in the absence of such norms, most 
subjects would not anticipate fast learning. 

Thus the motivation of the control group 
as a whole was to obtain learning, not fast 
learning. The victim did learn eventually, thus 
satisfying the subject's motivation. In the 
absence of motivation to obtain fast learning, 
the victim’s slowness cannot be considered a 
frustration. 

Why, then, did control subjects elevate
aggression intensity? The learning-hypothesis
control subjects raised shock intensity because
they had been told that higher shock
yields faster learning. This leaves the results
of the no-learning-hypothesis controls to be
explained. Their aggression intensity rose
approximately a point. This upward drift
occurred in both this control group and a
similar one used in the previous study (Buss,
1963). In the previous study the control
subjects were told to expect learning after 65
learning trials, which guaranteed that the
subjects would not expect fast learning; their
aggression intensity rose approximately half
a point. In earlier pilot research similar groups
elevated shock intensity slightly more than a
point.

The explanation of this upward drift, which
also occurs when subjects are instructed to
aggress (Brock & Buss, 1962; Buss & Brock,
1963), is speculative but straightforward.
There are 10 shock buttons; most subjects
start near the bottom end and remain in the
1-5 range. As the learning series progresses,
subjects tend to try out different shock levels. 
Since the subjects are initially at the low end, 
increased variability must yield an upward 
drift in mean shock levels. Note that this 
drift occurs whether subjects are told that 
learning will be slow (Buss, 1963) or not 
(present experiment). Questioning of sub- 
jects in previous research revealed no evi- 
dence of anger or annoyance with the victim. 
In brief, the upward drift can be explained 
without invoking frustration, for which there 
is no evidence. Thus it seems safe to conclude 
that the control group was indeed just that, 
that is, there was no frustration. 
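
The floor-effect argument for the upward drift can be illustrated numerically. The sketch below is an assumption-laden toy model, not the author's analysis: a subject is assumed to start at Button 2 and, on each shock trial, to move one button down, stay, or move one button up with equal probability, with choices clamped to Buttons 1-10. The step rule, starting button, and function names are all hypothetical.

```python
N_BUTTONS = 10  # shock buttons 1..10, as in the apparatus


def step(dist):
    """Advance the probability distribution over buttons by one trial.

    Assumed rule: move -1, 0, or +1 button with probability 1/3 each,
    clamped so the choice never leaves the range 1..N_BUTTONS.
    """
    new = [0.0] * N_BUTTONS
    for i, p in enumerate(dist):
        for d in (-1, 0, 1):
            j = min(max(i + d, 0), N_BUTTONS - 1)
            new[j] += p / 3
    return new


def mean_level(dist):
    """Expected button number under the distribution."""
    return sum((i + 1) * p for i, p in enumerate(dist))


def drift(start_button=2, n_trials=25):
    """Expected shock level before each trial and after the last one."""
    dist = [0.0] * N_BUTTONS
    dist[start_button - 1] = 1.0
    means = [mean_level(dist)]
    for _ in range(n_trials):
        dist = step(dist)
        means.append(mean_level(dist))
    return means


means = drift()
print(round(means[0], 2), round(means[-1], 2))
```

Although the assumed exploration is unbiased, the expected level never falls and ends above its starting value, because the floor at Button 1 truncates downward moves. This is the sense in which increased variability near the low end yields an upward drift without any appeal to frustration.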

Thus the frustration operations were
successful. Subjects were motivated to reach a
goal, this motivation being reflected in more
intense aggression. Subsequently, the subjects
were prevented from attaining the goal,
but this frustration did not intensify the
aggression. Since the frustration procedure was
adequate, perhaps the problem is with the
measurement of aggression.

It might be argued that the subject was not
really aggressing because he was instructed to
deliver shock for incorrect responses. However,
while the subject had to use shock, he
could have used the lower intensities. Shock
from Buttons 1 or 2 is sufficiently mild to be
a signal (that the response was wrong) without
being aggressive; but since Buttons 3 and
higher deliver clearly painful shocks, the
subject is not merely signaling an incorrect
response but delivering noxious stimuli as
well. This fits the generally accepted definition
of aggression: the delivery of noxious
stimuli to another organism.

It might also be argued that the physical
aggression employed here was so noxious that
subjects hesitated to use it and hence
frustration did not lead to aggression. However,
this argument does not bear close inspection.
Frustration did lead to aggression but no
more so than no frustration. As may be seen
in Figure 1, all three groups increased the
intensity of shocks as the trials progressed,
attaining final averages between Buttons 4
and 5, which gave rather annoying shocks.
Thus the subjects did not hesitate to use
noxious stimuli during the learning series.
Moreover, frustration was the only independent
variable that did not influence aggression
intensity. The other four (instrumental value,
feedback, and sex of subject and victim) all
significantly affected the intensity of
aggression, which suggests that this response
measure is not intrinsically difficult to manipulate.

In brief, both the frustration operations
and the measure of aggression appear to be
adequate, which means that the failure of
frustration to elicit aggression (compared to a
control) cannot be attributed to weaknesses
or artifacts of the experimental procedure. It
may be concluded that frustration leads to
no physical aggression even when the
aggression is instrumental. Integrating this
finding with those of the previous study, it
is clear that for college students pure
frustration is a relatively unimportant antecedent
of physical aggression. These results
emphasize the importance of a clear definition of
frustration and an unconfounded frustration
procedure. As was noted earlier, some
investigators have combined frustration with attack
and shown that the combination elicits
aggression. Since it has been demonstrated in
two large-scale experiments that frustration
is at best a weak determiner of aggression, it
seems likely that it was the attack aspects of
the frustration-attack combination that elicited
aggression in these other studies. Thus the
present results appear to be consistent with
the view expressed previously (Buss, 1961,
1963) that there is little relationship between
frustration and aggression. The author knows
of no research contradicting this view, that is,
research in which pure frustration has led to
attacking behavior.

In attempting to explain the lack of a 
frustration-aggression relationship, one possi- 
bility is the middle-class status of the sub- 
jects, college students. Physical aggression is 
not generally accepted in middle-class homes, 
and by the time maturity is attained, the pre- 
ferred mode is verbal aggression. The frus- 
trator may be subjected to verbal abuse, 
curses, threats, or rejection, but he is rarely 
attacked physically. Of course these comments 
apply to the middle class, and subjects of 
lower-class status might be more likely to 
aggress physically in the face of frustration. 
This is a matter for subsequent research. 

If frustration is not an important deter- 
miner of aggression in these subjects, what is? 
The answer lies in the remaining independent 
variables, all of which significantly affected 
aggression. 

The instrumental value of aggression has
been emphasized in earlier formulations
(Buss, 1961, 1963), in which it was pointed
out that aggression may be reinforced by any
of the usual rewards: food, water, sex, money,
approval, dominance, and escape from
aversive stimuli. Instrumental aggression, which
leads to such rewards, was distinguished from
angry aggression, which is usually reinforced
by the victim's pain or discomfort. In the
present study angry aggression might have
been elicited by frustration, but it did not
occur; frustration led to no more aggression
than did a control. What did occur was
instrumental aggression. When the aggression was
of service in attaining a goal, it was more
intense than when it had no value. Thus
rewarding aggression or showing that it can
lead to the goal (as was done by instructions
here) is a potent determiner of aggression
intensity.

The effect of feedback from the victim was
also significant but in the opposite direction.
The victim's pain vocalizations caused a drop
in the intensity of aggression. Considering
the way the feedback occurred, it must be a
potent variable. The victim vocalized pain
every time Buttons 8, 9, or 10 were used, but
these buttons were used infrequently. When
subjects went beyond Button 5, it was usually
only to Buttons 6 and 7, which elicited
feedback only half the time. Hence when the
subject used the more intense pain levels, he
usually received the feedback only half the
time. Yet the pain vocalizations were effective
in reducing aggression intensity, which means
that feedback appears to be a powerful
determiner of aggression intensity.

Sex of subject and victim were also signifi- 
cant variables. As in the previous study, men 
aggressed more and were aggressed against 
more than women. There was one discrepancy. 
In the previous study women aggressed about 
equally against male and female victims, 
whereas in the present study women aggressed 
more against men than against women. There 
is no obvious reason for this discrepancy, and 
for the present it seems best to draw no con- 
clusions about whether women aggress differ- 
entially against male and female victims. 
What can be concluded is that the sex of both 
subject and victim must be taken into account 
in making predictions about aggression; both 
influence aggression intensity. 


Arnold H. Buss

EFFECTS OF VERBAL REINFORCEMENT COMBINATION
AND INSTRUCTIONAL CONDITION ON PERFORMANCE
OF A PROBLEM-SOLVING TASK

Performance on a sentence-construction task was studied using a 3 X 3
factorial design which varied reinforcement combination (right-blank,
wrong-blank, and right-wrong) and instructions (instructions about the task,
with and without explanation of the reinforcers, and a verbal conditioning
procedure explaining neither). Ss were VA medical patients, 30 per group.
Performance improved (p < .001) with amount of preliminary information and
in all 3 instructional conditions was better (p < .01) under wrong-blank than
right-blank. Right-wrong performance was similar to wrong-blank in the
problem-solving conditions but superior in verbal conditioning. It was
suggested that the differences among combinations were due to discrepancies
in the informational properties of blank, and in verbal conditioning, of the
overt reinforcers as well. The implications of these hypotheses for studies
comparing groups differing in personality were discussed.


Hypotheses concerning the motivational
and affective reactions elicited by social
reward and censure, especially as they are
related to the subjects’ personality
characteristics, are frequently tested in an
experimental design which involves a study of the
effects of response-contingent verbal reinforcers
on the performance of a laboratory task. In
many of these investigations, in which the
positive or reward condition consists of
following the subjects’ correct responses with
some verbal indicator of approval (e.g.,
“Right” or “Good”) and the negative or
censure condition of following incorrect
responses with some indicator of disapproval
(e.g., “Wrong”), the reinforcement conditions
provide subjects with their only information
about response correctness. Motivational
interpretations of the results of the latter type
of study appear to rest on the assumption
that performance differences between groups
or reinforcement conditions are due solely to
the effects of the overt outcome events.
Recent evidence, however, suggests that in many
instances this assumption may be erroneous
and that the interpretation given to the
empirical data obtained in a number of
investigations may therefore have to be
reexamined.

As Buss and his colleagues (e.g., Buss,
Braden, Orgel, & Buss, 1956) were among
the first to point out, both positive and
negative conditions in the type of experimental
arrangement described above involve
reinforcement combinations: each of the subject's
responses is followed by either an overt signal
from the experimenter or by nothing (or, as it
is coming to be called and will be referred
to here, by a blank). There is increasing
awareness (e.g., Buchwald, 1959, 1962;
Levine, Leitenberg, & Richter, 1964) that the
effects of blank on performance, as well as
those of the overt outcome events, must be
considered in accounting for the results of
these studies.

1 Now at the University of Texas.

In a number of reinforcement combination 
studies (e.g., Buchwald, 1959; Buss et al.,
1956) it has been found that under condi- 
tions in which subjects are informed about 
the task but not about the reinforcement pro- 
cedures, performance on conceptual tasks 
with two response alternatives is better under 
a combination in which the experimenter 
follows incorrect responses with “Wrong” 
(wrong-blank or Wb) than under a right- 
blank (Rb) combination. Performance under 
the former condition has typically been found 
to be equivalent to that under a right-wrong 
(RW) combination in which each of the sub- 
ject’s choices is followed by a response from 
the experimenter. On the assumption that 
blank has no reinforcement value in either
combination, Buss et al. concluded from such 
findings that “wrong” must therefore be a 
stronger negative reinforcer than “right” is 
a positive reinforcer. Acceptance of this con- 
clusion does, of course, permit the further 
hypothesis to be made that the properties 
differentiating the two overt reinforcers are 
motivational in nature. However, in a study 
employing a two-alternative verbal-discrimi- 
nation list, the present writer (Spence, 1964) 
obtained data which suggested a very dif- 
ferent interpretation of the empirical results. 
It was demonstrated by means of probability 
analyses that blank in the Wb combination 
was equally effective in producing correct 
responses as “right” in the Rb and RW 
combinations but that blank in the Rb com- 
bination produced less avoidance of incorrect 
responses than “wrong” in the other two. 
Thus the results suggested that the empirical 
differences between Rb and Wb were due to 
discrepancies in the kind or amount of in- 
formation the subjects extracted from blank 
in each combination rather than to differ- 
ences in the magnitude of the effects, motiva- 
tional or otherwise, of right and wrong per 
se.² Similarly, the equivalent performance of
RW and Wb groups could be attributed to 
blank in the latter combination conveying as 
much information as the overt reinforcer for 
which it stood. 

Levine et al. (1964), on the basis of data 
from a different kind of experimental ar- 
rangement, have recently proposed that in 
the absence of instructions about its mean- 
ing, blank tends to be treated as if it meant 
“right” and have specifically suggested that 
this tendency explains the performance 
equivalence of the RW and Wb conditions 
in the type of reinforcement combination 
study just described. A similar kind of
explanation could be offered for the poorer
performance of Rb subjects: a strong, initial
tendency to regard blank as “right” conflicts
with the demands of the experimental con[...]
to treat it as “wrong.” Thus for many
subjects, blank is less effective in the Rb
combination than “wrong” and mastery of the task
is therefore retarded.

2 This explanation of the performance differences
produced by the two combinations, it should be
noted, requires no assumption to be made about the
magnitude of the reinforcing effects of right versus
wrong per se. Further, the materials used in these
studies were not of such a nature as to permit
empirical test of any hypothesis concerning their
relative strengths.

In all of the studies mentioned above, it
will be recalled, subjects were instructed
about the learning or problem-solving task
but not about the reinforcers. It is possible
to supplement these instructions by explaining
the reinforcement procedures that will
be used, as well as informing the subject
about the task he is expected to master. At
the other extreme, it is possible to mention
neither, giving subjects instructions that
specify no more than the general type of
responses he is to make to the stimulus
materials. This latter experimental arrangement,
it will be recognized, corresponds to the
“verbal-conditioning” procedure.

The experimental evidence (e.g., DeNike &
Spielberger, 1963) clearly indicates that,
independent of type of reinforcement combination,
performance should be positively related
to the amount of information given to the
subject in the instructions. A question arises,
however, as to whether the same pattern of
performance among the combinations that
has been found with subjects informed about
the task but not the reinforcers occurs
under the other two instructional conditions.
If the inferior performance of Rb subjects in
the former type of study is due solely to the
subjects' inability to comprehend and utilize
the information being conveyed by blank with
equal effectiveness as an overt reinforcer, one
might expect that this inferiority would be
minimized or completely disappear if preliminary
instructions were given in which the
reinforcement procedures as well as the task
were explained. This expectation was
confirmed by the present writer (Spence, Lair, &
Goodstein, 1963) in an investigation parallel
to the verbal-discrimination study described
above except for the addition of instructions
concerning the reinforcers, no significant
differences between Rb and Wb conditions
appearing in groups of schizophrenics, medical
patients, and college students. However,
Pishkin (1963) and Lydecker, Pishkin, and
Martin (1961), in studies employing a
conceptual task, found that the performance of
schizophrenic subjects was significantly better
with a Wb combination than with Rb, thus
duplicating the results of studies using
uninformed subjects.

Before considering the effects of different
reinforcement combinations on the performance
of laboratory tasks under the
verbal-conditioning procedure, recent evidence
concerning the “learning without awareness”
issue should first be mentioned. The results
of a number of studies (e.g., Dulany, 1962;
Levin, 1961; Spielberger, 1962) indicate that
subjects who, in response to careful questioning
in a postexperimental interview, do not
offer at least a partially correct hypothesis
about the principle determining the
contingency between the overt reinforcers and
their responses show little or no evidence of
having learned. These data thus contradict
the conclusions drawn from earlier studies in
which, on the basis of responses to very brief
interviews, it appeared that learning often
took place without the subjects’ awareness of
the response-reinforcement sequence. The
data do show, however, that performance
covaries with the ease and explicitness with
which subjects are able to state the basis of
this contingency (e.g., Spielberger, 1962).

Formulation of hypotheses about the
principle underlying the response-reinforcement
contingency would seem to presuppose that
the subject had not only developed the notion
that there was some sort of specific
relationship between his responses and the
experimenter’s behavior but also that he had
formulated hypotheses about the information value
of the reinforcing events, including the
information value of blank. Thus the same
factors which result in blank being less
effective in the Rb combination than in Wb
under conditions in which subjects are
informed about the task but not the reinforcers
might also be expected to be present in the
verbal-conditioning situation. While there are
undoubtedly factors operating that are not
present when subjects are given
problem-solving instructions, there seems to be no a
priori reason to suspect that performance
under an Rb combination would not therefore
be inferior to that under Wb and RW when
a verbal-conditioning procedure is used.³

Several investigators (e.g., Crowne & Strick- 
land, 1961; Greenspoon, 1955; Leventhal, 
1959) have conducted verbal-conditioning 
studies using both an Rb and a Wb combina- 
tion. However, on what now appears to be 
the erroneous assumption that learning can 
take place without awareness (and that lack 
of awareness is relatively simple to deter- 
mine), they reported data only from those 
subjects who did not verbalize correctly the 
contingency between the overt reinforcers and 
their responses during a brief postexperi- 
mental interview. Thus the results of these 
studies are of little value in determining the 
relative effects on performance of the Wb 
and Rb combinations unless, of course, it so 
happened that the subjects dropped from each 
condition were equal in number and level of 
performance. It is interesting to note that 
Crowne and Strickland reported that in the 
process of obtaining 42 “unaware” subjects 
per group, the data from 15 individuals 
tested under the Wb condition were elimi- 
nated, as opposed to only 1 subject in the 
Rb condition, while Greenspoon dropped the 
data of 9 Wb subjects and 1 Rb subject from 
groups of 25. 

The purpose of the present experiment was 
to obtain additional evidence concerning the 
magnitude of the performance differences 
among various reinforcement combinations as 
related to the kind of information about the 
experimental procedures initially given to the 
subject. A 3 X 3 factorial design was em- 
ployed: 3 reinforcement combinations (Rb, 
Wb, and RW) and 3 instructional conditions 
(information about the task, with and 
without an explanation of the reinforcers, 
and information about neither task nor
reinforcers). Following a technique devised by
Taffel (1955) which has been used in many
verbal-conditioning studies, the experimental
task required the subjects to construct
sentences using a given verb and starting with
one of four personal pronouns. Since
discovery of the correct principle of sentence
construction (use of a given pronoun class)
was crucial for its mastery, the task was thus
more comparable to traditional
problem-solving or concept-formation arrangements
than to rote learning in which memorization
is paramount.

3 A number of verbal reinforcers in addition to
“right” and “wrong” have been used in
verbal-conditioning studies, for example, “Good,”
“Mmmm-hmm,” “Not so good,” “Huh-uh,” etc. For
convenience of presentation, however, the symbols R and W
will continue to be used here and will refer in all
subsequent references to any verbal indicator of
approval or of disapproval, respectively.


METHOD 
Subjects 


The subjects were 270 male patients from medical 
and surgical wards of the Iowa City Veterans Ad- 
ministration Hospital, all of whom were 50 years 
of age or less (M = 41.11), had a minimum of 7 
years of education (M = 10.31), and no history of
CNS damage, alcoholism, or psychiatric disturbance. 
The subjects were assigned to an instructional condi- 
tion and within each of these, to type of reinforce- 
ment combination in such a manner that the nine 
treatment groups were matched for age and educa- 
tion as well as for frequency of choice of pronoun 
class during initial operant trials. 


Materials 


Seventy 3 X 5 inch index cards were presented in 
random order, each card having a different verb in 
the simple past tense typed in the center and four 
personal pronouns (I, We, He, They) typed in a 
line at the bottom. The serial order of the pronouns 
was varied among the cards, each of the 24 possible 
orders being used an approximately equal number 
of times.
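
The counts in this paragraph can be checked with a short sketch. Four
pronouns yield 4! = 24 serial orders, and 70 cards cannot be divided
evenly among 24 orders, which is presumably why the orders were used
only approximately equally often. The round-robin assignment below is
an assumption made for illustration; the text does not say how orders
were allotted to cards, and the names `pronoun_orders` and
`assign_orders` are hypothetical.

```python
from collections import Counter
from itertools import cycle, islice, permutations

PRONOUNS = ("I", "We", "He", "They")


def pronoun_orders():
    """All possible left-to-right orders of the four pronouns."""
    return list(permutations(PRONOUNS))


def assign_orders(n_cards=70):
    """Assumed scheme: cycle through the orders, one per card."""
    return list(islice(cycle(pronoun_orders()), n_cards))


orders = pronoun_orders()
usage = Counter(assign_orders())
print(len(orders), sorted(set(usage.values())))
```

With this scheme every order appears either two or three times across
the 70 cards, which is as close to equal use as the numbers allow.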


Procedure 


Preliminary instructions given to all subjects 
stated that for each card, the subject was to make 
up a sentence which started with one of the four 
pronouns and contained the verb. The subjects were 
then given 10 cards (trials) without reinforcement. 
Following these operant trials, the subjects assigned 
to the first of the three instructional conditions were 
given additional, problem-solving instructions. These 
specified that on all subsequent trials, the subject 
was expected to construct a certain kind of sentence 
and that it was his task to try to discover what 
kind of sentence this was and to give as many as 
he could. These subjects were also given an explana- 
tion of the reinforcers to be used, the instructions 
mentioning not only the meaning of the experi- 
menter’s overt response, but also, in the Rb and 
Wb groups, of the experimenter’s failure to respond. 
Henceforth this instructional condition in which the 
subjects were informed both about the task and the 
reinforcers will be referred to as PS-I. In the second 
condition, subjects were given the same instructions 
about the task but the reinforcement procedures
were not explained. Rather all subjects were told,
“it will become clear to you as we go along [...]
you have made up the right kind of sentence.” This
condition will be referred to as PS-NI. Three
reinforcement combinations were employed for the
subjects given problem-solving instructions. In the
Rb condition, the experimenter said “Right” after
correct sentences but remained silent after incorrect
ones, while in the Wb condition, the experimenter
said “Wrong” after incorrect and nothing after
correct sentences. In the RW condition, the
experimenter said “Right” or “Wrong” after each
sentence. The overt reinforcers were delivered in a firm
but affectively neutral tone of voice.
In the third instructional condition, the reinforcers
were introduced without comment or explanation
following the initial 10 trials. Three reinforcement
combinations were also employed but, following the
procedure used in many verbal-conditioning studies,
the words “Good” and “Not so good” were used as
the overt reinforcers rather than “Right” and
“Wrong.” Although different words were used, the
three reinforcement combinations used with the
verbal-conditioning procedure will also be labeled
Rb, Wb, and RW. The instructional condition will
be referred to as VC.
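
The three reinforcement combinations amount to a simple rule mapping a
response's correctness to what the experimenter says. The sketch below
restates that rule as described in this section; the function name
`feedback`, the `WORDS` table, and the use of `None` to stand for a
blank are illustrative choices, not part of the original procedure.

```python
from typing import Optional

# Overt words used under each instructional condition, per the Procedure.
WORDS = {
    "PS-I": ("Right", "Wrong"),
    "PS-NI": ("Right", "Wrong"),
    "VC": ("Good", "Not so good"),
}


def feedback(combination: str, instruction: str, correct: bool) -> Optional[str]:
    """Return the experimenter's overt response, or None for a blank."""
    positive, negative = WORDS[instruction]
    if combination == "Rb":  # overt approval or blank
        return positive if correct else None
    if combination == "Wb":  # blank or overt disapproval
        return None if correct else negative
    if combination == "RW":  # an overt response after every sentence
        return positive if correct else negative
    raise ValueError(f"unknown combination: {combination}")


print(feedback("Rb", "PS-NI", False), feedback("Wb", "VC", False))
```

Written this way, it is easy to see that blank carries the opposite
meaning in the Rb and Wb combinations and never occurs under RW.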
In all nine treatment groups, sentences beginning
with a pronoun in the first person (I or We) were
designated as correct for half of the subjects and
sentences beginning with pronouns in the third
person (He or They) were correct for the other half.
Following the 60 reinforced trials, subjects were
interviewed briefly to determine their ability to
specify what constituted a correct sentence, their
understanding of the information being conveyed by
the reinforcing events, etc.


RESULTS 


The differences in performance that occurred among the
reinforcement groups tended to appear on the
second block of 10 trials and to increase in
magnitude with successive trials. The measures
chosen for statistical analysis were therefore
the total number of correct responses given
by the subjects on the last 20 trials (Trials
41-60). The means of these measures for
each of the nine treatment groups are shown
in Table 1. It may be seen that the amount
of preliminary information given to the
subjects affected performance, the PS-I groups
giving the greatest number of correct
responses and the VC groups the least. A 3 X 3
analysis of variance indicated that these
differences among conditions were statistically
significant (F = 15.23, df = 2/263, p < .001).
With respect to the effects of type of
reinforcement, it will be observed that the Wb
combination produced better performance
than Rb in all three instructional conditions.
Performance under the RW combination was
better than under Wb in the VC condition
but slightly poorer in the two problem-solving
conditions. The results of the 3 X 3 analysis
yielded significant terms for both
reinforcement type (F = 6.57, p < .01) and the
interaction between reinforcement and
instructional condition (F = 3.80, p < .05).

TABLE 1

MEAN NUMBER OF CORRECT RESPONSES ON TRIALS 41-60
FOR THE THREE REINFORCEMENT COMBINATIONS
AND THREE INSTRUCTIONAL CONDITIONS

                       Instructional condition
Reinforcement      PS-I           PS-NI            VC
combination      M      SD      M      SD      M      SD
RW             17.87   3.95   15.17   5.04   16.03   4.44
Wb             18.00   3.43   16.27   4.82   14.00   4.07
Rb             16.73   4.75   13.20   4.96   11.97   4.42

Since the RW groups contributed heavily
to the significant interaction term reported
above, a 2 X 3 analysis was performed which
excluded the data from these subjects. The
terms for instructional condition (F = 14.33)
and for Rb versus Wb (F = 9.95) were both
significant (p < .01) but the interaction
between the two variables was not (F = 1.04).
Separate analyses of the data from each
instructional condition were also made which
took correct pronoun class as well as type
of reinforcement into account. Performance
tended to be better when I-We responses
were correct (significant, at the .05 level,
only for the PS-NI condition), apparently
due to initial response bias since more I-We
than He-They responses also tended to be
given on the operant trials. None of the
interactions between reinforcement type and
pronoun class was significant. As for type of
reinforcement, differences among the three
combinations or between Rb and Wb were found
to be significant (p < .05) in the PS-NI and
VC conditions but not in the PS-I condition.
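
The pattern described in this section can be read off the cell means in
Table 1 with a few lines of arithmetic. The sketch below is only a
descriptive summary and does not reproduce the analyses of variance;
because every group contained 30 subjects, the unweighted average of
the three cell means in a column equals the marginal mean for that
instructional condition. The names `TABLE_1`, `condition_means`, and
`wb_minus_rb` are illustrative.

```python
# Cell means copied from Table 1 (correct responses on Trials 41-60).
TABLE_1 = {
    "RW": {"PS-I": 17.87, "PS-NI": 15.17, "VC": 16.03},
    "Wb": {"PS-I": 18.00, "PS-NI": 16.27, "VC": 14.00},
    "Rb": {"PS-I": 16.73, "PS-NI": 13.20, "VC": 11.97},
}
CONDITIONS = ("PS-I", "PS-NI", "VC")


def condition_means(table):
    """Marginal mean for each instructional condition (equal group sizes)."""
    return {c: sum(row[c] for row in table.values()) / len(table)
            for c in CONDITIONS}


def wb_minus_rb(table):
    """Wb advantage over Rb within each instructional condition."""
    return {c: table["Wb"][c] - table["Rb"][c] for c in CONDITIONS}


for cond in CONDITIONS:
    print(cond, round(condition_means(TABLE_1)[cond], 2),
          round(wb_minus_rb(TABLE_1)[cond], 2))
```

The marginal means fall from PS-I to PS-NI to VC, and the Wb minus Rb
difference is positive in all three conditions, in line with the text.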

DISCUSSION
Problem-Solving Conditions 


The results obtained from the two
problem-solving conditions are consistent with an
explanation which states that blank in the Wb
combination is equivalent in its effects to
“Right” but that blank in the Rb
combination is less effective than “Wrong” in
producing avoidance of incorrect responses. That
is, the RW and Wb groups were similar in
performance and both were superior to the
Rb subjects.

The results of the postexperimental
interview data not only lend support to this
general conclusion but also confirm the more
specific hypothesis, offered earlier, that the
relative ineffectiveness of blank in the Rb
combination may come about because of a
preestablished tendency to treat it as
meaning “Right” which conflicts with the demands
of the situation to treat it as “Wrong.” Of
the 30 Rb subjects in the PS-NI condition,
27 were able to state that blank meant they
had responded incorrectly. (Of the remaining
subjects, 1 said it meant “Right,” 1 that it
meant nothing, and the third said he did not
know.) However, in response to further
questioning, 17 (63%) of these 27 subjects stated
that blank did not always mean “Wrong”;
when the subject was correct, the
experimenter sometimes said nothing so that blank
could mean either “Right” or “Wrong.” In
contrast, all of the Wb subjects correctly
identified the meaning of blank and only 9
(30%) said it meant “Right” only
sometimes. Similar trends were noted in the PS-I
groups.
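
The interview percentages quoted above follow directly from the counts
given in the text. The short check below assumes the Wb percentage is
taken over all 30 subjects in that group, which the text implies but
does not state outright; the helper name `percent` is illustrative.

```python
def percent(count, total):
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


# Rb, PS-NI: 27 of 30 said blank meant "Wrong"; 3 gave other answers.
rb_accounted = 27 + 1 + 1 + 1
# Of those 27, 17 said blank did not always mean "Wrong".
rb_sometimes = percent(17, 27)
# Wb, PS-NI: 9 of 30 said blank meant "Right" only sometimes.
wb_sometimes = percent(9, 30)

print(rb_accounted, rb_sometimes, wb_sometimes)
```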

The finding that the Rb combination
tended to produce poorer performance than
Wb even when the subjects were informed
about the reinforcement procedures thus
replicates the results of Pishkin (1963) and
Lydecker et al. (1961) who also used a
problem-solving task but not those of the
present writer (Spence et al., 1963) in a
study employing a rote-learning,
verbal-discrimination list. In the latter investigation
it was demonstrated that blank was
equivalent to an overt reinforcer in both
combinations, leading to no performance differences
between Rb and Wb. It is possible that in a
two-alternative verbal-discrimination
situation in which it is easy to grasp the notion
of what constitutes a correct response and
which requires only that the subject recognize
the correct stimulus in each pair, the subjects
have less difficulty in utilizing the information
provided by the instructions (particularly the
information that blank in the Rb
combination signals “Wrong”) than subjects in a
problem-solving situation who are
preoccupied with trying to discover what makes a
given response correct.

Contained in this discussion of the results 
of reinforcement combination studies in which 
subjects are given problem-solving instruc- 
tions is an assumption whose implications 
should be made explicit. Assuming that they 
are delivered in an affectively neutral manner, 
words such as “Right” or “Wrong,” it is 
being suggested, do not differ from blank in 
the motivational or affective reactions that 
they elicit or, for that matter, from any other 
affectively neutral method of conveying the 
same information to the subject. Thus no 
reinforcement combination is more motivating 
or affect-arousing than any other. Differences 
in performance between various combina- 
tions, it is being postulated, occur only be- 
cause of informational discrepancies between 
them. Similarly, interactions between rein- 
forcement combinations and groups differing 
in some personality characteristic would be 
expected only if the groups differed in their 
intellectual capacity to utilize the information 
being conveyed by the outcome events (e.g., 
schizophrenics and nonschizophrenics). 


Verbal Conditioning 


The results obtained from the VC
condition of the present study support the view,
presented earlier, that the same informational
discrepancies that occur when subjects are
given problem-solving instructions also
operate in this situation to produce better
performance in the Wb than in the Rb
combination. However, consideration of the
performance data from all three
reinforcement groups in conjunction with the
interview data suggests that another factor of equal
or greater importance must be taken into
account in interpreting the results.

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The instructions given in the problem- 
solving conditions essentially specified that < 
each of the subjects’ responses would be . 
classified as correct or incorrect according to 
some principle they had to discover. The 
interview data suggested that whatever they 
made of blank, none of the subjects had any 
difficulty in grasping the notion that “Right” 
indicated that the sentence they had just 
given was of the type the experimenter 
wanted them to make up or that "Wrong". 
indicated it was not. In the VC condition, 
on the other hand, the subjects had to dis- 
cover for themselves that there was a specific 
contingency between the reinforcers and their ` 
responses, that is, the notion that the experi- 
menter was classifying their sentences into 
two categories, only one of which was ac- 
ceptable. The interview data indicated that 
not all subjects in the verbal-conditioning 
situation were able to verbalize that such a 
contingency had occurred, a product not only 
of the absence of problem-solving instructions 
but also, perhaps, of the use of the more 
ambiguous “Good” and “Not so good” rather 
than “Right” or “Wrong.” The interview 
responses also suggested that the probability 
of the subjects discovering this contingency 
(and hence, presumably, the likelihood that
they would develop correct hypotheses about 
the basis for it) was related to reinforcement 
condition. When questioned about the overt 
reinforcers, 26 of the 30 RW subjects as 
opposed to 23 of the Wb subjects gave some 
indication that they recognized a specific 
contingency, even though they may have had 
no hypothesis about what made a sentence 
“good” or “not so good.” The rest simply
expressed confusion about the significance of 
the overt reinforcers (e.g., “don’t know,” “I 
wondered”). Of the 30 Rb subjects, only 19
were able to state that the reinforcers were
related to some characteristic of specific sentences.
Four of the Rb subjects, in fact,
denied hearing the experimenter say anything
during the administration of the task. Three
others gave general hypotheses about the reinforcers
(“just to encourage me and make
me feel at home,” “you were making conversation,”
“prompting me to go on”). In
the absence of problem-solving instructions
“Good” is apparently more easily dismissed
than is “Not so good” or a combination of
the two as a polite murmur, essentially
irrelevant to the task.

Those unable to interpret the overt reinforcer
accurately (link it to the experimenter’s
approval or disapproval of a given
sentence) were even less likely, of course, to
be able to interpret blank. Thus differences
among the groups in the information extracted
from both the overt reinforcers and,
in the case of the Rb and Wb subjects,
from blank may have been responsible for
the superiority of RW over Wb and of both
over Rb.

With respect to the contribution of per- 
sonality variables to the performance of lab- 
oratory tasks administered under the verbal- 
conditioning procedure, it seems reasonable 
to assume that groups that also happen to
differ initially in level of intellectual functioning
(as often occurs when psychopathological
groups are being compared with normals)
will differ in performance under a given
reinforcement combination. Interactions between
reinforcement combination and group
might also be expected because of purely
intellectual factors, the magnitude of the differences
between combinations being greater
in those functioning at a lower level due to
their inability to utilize the information being
conveyed by the reinforcers as effectively.


However, in contrast to what was sug- 
gested earlier about the role of personal- 
ity variables in reinforcement combination 
studies using problem-solving instructions, it 
does not seem reasonable to claim that dif- 
ferences in performance between personality 
groups occur only when there are intellectual 
differences between them. Several investigations
(e.g., Crowne & Strickland, 1961;
Epstein, 1964) demonstrating a relationship
between verbal-conditioning performance and 
a personality variable have in fact directly 
shown that the groups were comparable 
in IQ. However, the interpretation almost 
automatically given to the results of these 
studies, namely that differences in performance
come about because the positive overt reinforcer
(e.g., “Good”) has greater incentive
value for one group than another or that
the negative overt reinforcer (e.g., “Not so
good”) is more punishing, does not necessarily
follow. An alternate possibility is that
these personality factors are related more 
generally to the subjects’ alertness or sensitiv- 
ity to environmental cues, particularly to the 
nuances of the experimenter’s behavior, and 
hence to the likelihood that they will become 
aware of the various contingencies that are 
occurring. The willingness of the subjects to
respond in the manner they believe the ex- 
perimenter expects—their behavioral inten- 
tions, as Dulany (1962) calls them—may 
also differ among personality groups.

Explanations of this type, it will be noted, 
are consistent with the writer’s earlier hy- 
pothesis that unless the experimenter’s ver- 
balizations very definitely state or imply by 
intonation that the subject is exceeding or 
falling below an acceptable standard, overt 
reinforcers of the type used in these rein- 
forcement combination studies may have no 
greater incentive value or affective impact 
than blank. While individuals undoubtedly 
attach different degrees of importance to 
being right or being wrong, or, more gen- 
erally, to doing well or pleasing the experi- 
menter, the procedure of labeling only correct 
responses may nonetheless be equally reward- 
ing for any given subject as the procedure 
of labeling only incorrect responses—and 
equally punishing. As long as there are no 
informational discrepancies between them, 
both combinations may be functionally equiv- 
alent to the combination in which the experi- 
menter labels both correct and incorrect 
responses.


EXPLORATIONS INTO THE EFFECTS OF PICTURE
CUES ON THEMATIC APPERCEPTIVE EXPRESSION
OF ACHIEVEMENT MOTIVATION ¹


JOSEPH VEROFF, SHEILA FELD,² AND HARRY CROCKETT ³


Survey Research Center, University of Michigan 


The validity of thematic apperceptive measures of achievement motivation 
based on different pictures was tested separately for Ss from varying social
groups. Motive scores were related to 2 other measures for which an association 
was theoretically expected. It was found that for predicting upward occupational 
mobility from high achievement imagery, the picture of a motoric type of 
occupational setting was more effective for men of conceptual backgrounds 
and that the picture with a conceptual setting was more effective for men of 
motoric backgrounds. The predicted association between moderate feelings of 
social inadequacy and high achievement imagery was found to some extent 
for full-time housewives’ stories written in response to a career setting. Results 
suggest that the validity of the apperceptive measures of achievement motive 
will be enhanced by using pictures of situations that are not very familiar to Ss. 


This paper explores a general methodo- 
logical question about thematic apperceptive 
devices for measuring motives: can com- 
mensurate data be obtained from people of 
different social backgrounds if the same stim- 
uli are used to elicit the fantasy? Recently 
this important issue has been stressed by 
Atkinson (1958b), Veroff (1961), and Veroff, 
Atkinson, Feld, and Gurin (1960) in connection
with the thematic apperceptive measurement
of the achievement motive (McClelland,
Atkinson, Clark, & Lowell, 1953).
The issue has a long history of discussion,
especially in Henry’s (1947) pioneering cross-cultural
work and in Lindzey’s (1961) recent
summary of cross-cultural studies using
projective techniques.

1 The investigations reported in this paper are
based on data obtained within a national sample
survey supported by the Joint Commission on
Mental Illness and Health (Gurin, project director).
Analysis of these data was supported by two grants
from the National Institute of Mental Health, United
States Public Health Service, Projects M2181
(Veroff, principal investigator) and M2280 (Gurin,
principal investigator), and a special grant from the
University of Michigan’s Rackham School of Graduate
Studies Award (to the senior author).

2 Now at the Mental Health Study Center, National
Institute of Mental Health, Adelphi, Maryland.

3 Now at the Department of Sociology and Anthropology,
University of North Carolina.

The investigators of achievement motivation
view the behavior evidenced in stories
told to pictures as a joint function of a sub- 
ject’s enduring motivational predispositions 
and his current life situation, including the 
momentary testing situation, as well as the 
situations portrayed in the pictures used as 
stimuli. The pictures are assumed to operate 
as cues that elicit in the subject reactions that 
are based upon his past experiences in set- 
tings similar to those portrayed in the pic- 
tures (Atkinson, 1958b; McClelland et al., 
1953). While not a necessary corollary of 
this position, researchers of the achievement 
motive have generally assumed that the 


closer the situation being portrayed in the pictures 
to the life situation of the individual the more 
readily he can identify with the characters in the 
story and hence the more likely the motive measures 
will be good estimates of motivational behavior in 
the person’s ongoing daily activity [Veroff, 1961,
p. 96].


This paper will raise some question about this 
assumption. 

Valid measurement of the achievement 
motive would ideally require the selection of 
pictures that depict situations for which per- 
sons from widely differing social backgrounds 
have the same achievement expectancies. This
research goal is difficult, if not impossible, to
attain. However, as Atkinson (1958b) has
pointed out, since the index of motive strength 
is usually based upon the imagery from a 
series of pictures,

we need only assume that the average strength of
a particular expectancy (e.g., the expectancy of
achievement or of power, etc.) aroused by all of the 
pictures in the series is approximately equal for all 
subjects. . . . This condition is likely to be approxi- 
mated when the situations portrayed in the pictures 
are representative of a wide variety of life situations 
in which people can satisfy the particular motive 
[p. 609]. 

It will be noted that these assumptions 
concerning the types of pictures that should 
be used to elicit valid apperceptive measures 
of motivation do not rely heavily upon the 
“projective” nature of these measuring instru- 
ments, and do not stress the elicitation of 
unconscious fantasies in thematic appercep- 
tive devices. These departures from the typi- 
cal clinical view of thematic apperceptive 
techniques have been justified to the degree 
that the motive measured does not arouse 
conflict for the subject (Atkinson, 1958b). 

Because both the clinical and typical cross- 
cultural uses of thematic apperceptive tech- 
niques are designed to tap unconscious as 
well as conscious needs, the problem of de- 
fining ideal pictures to use is more compli- 
cated. From the time of the first development 
of the Thematic Apperception Test as a clini- 
cal device, psychologists have suggested there 
be some similarity between the subject and 
figures in the pictures to evoke appropriate 
identification. Murray (1938) proposed that
each picture have at least one figure with
whom the subject could readily identify, and
suggested, therefore, separate sets of pictures 
for males and females, and for children, 
young adults, and older persons. On the other 
hand, Tomkins (1947) and others have 
pointed out that much of the value of the 
TAT would be lost if the identification with 
figures in the pictures was very easy since 
repressed material is assumed to be more 
easily elicited when the materials are remote 
from the subject. 

The TAT has been modified to match sub- 
jects and pictures (Alexander & Anderson, 
1957; Henry, 1947; Lessa & Spiegelman, 
1954). The unmodified TAT has also been 
used in other cross-cultural researches 
(Caudill, 1949; Vogt, 1951). According to 
Henry (1956) it is

probable that the criterion of culture-appropriateness
(for selecting pictures) need imply only appropriateness
to the general culture area rather
than to any specific subgroup in the culture [p. 51].

Lindzey (1961) emphasizes the need for stimulus
constancy in order to compare groups of
subjects but points out that constancy of
meaning and value rather than of
stimulus constancy is the issue.

There has been a dearth of research dealing
with the validity of measures of motivation
using stimuli of varying degrees of similarity
to different subgroups. In Murstein’s (1961)
recent summary of research on variations of
stimulus similarity in thematic apperceptive
techniques, most cited studies dealt with
formal internal characteristics of stories such
as transcendence (Weisskopf & Dunlevy,
1952) or length (Thompson, 1949). Relatively
few of the studies cited by Murstein or
more recent work dealt with thematic content
of stories (e.g., Feshbach, Singer, & Feshbach,
1963; Lesser, Krawitz, & Packard,
1963; Rosenstein, 1952; Veroff, Wilcox, &
Atkinson, 1953) and still fewer with external
validating criteria for measures derived from
stories (Birney, 1955; deCharms, Morrison,
Reitman, & McClelland, 1955; Silverstein,
1959; Veroff et al., 1953).

The present study represents an attempt to
examine the predictive validity of a set of
achievement-motive scores based on pictures
that vary in their similarity to the work experiences
of different population subgroups.
The validity of these scores is compared for
different social groups assumed to have had
different experiences with the work situations
portrayed in the pictures. Two predictions
based on the theory of achievement motivation
developed by McClelland et al.
(1953) and Atkinson (1957) and supported
by previous research were retested under the
above conditions. Each of these predictions is
discussed below.

One prediction is that achievement motive
is positively related to intergenerational occupational
mobility in men. The rationale for
this prediction is stated by Crockett (1962):

strong achievement motive should lead to more
“realistic” striving, to greater effort, and to greater
persistence than weak achievement motive, and, as
a consequence to greater accomplishment in the occupational
sphere [p. 195].


PICTURE CUES AND NEED ACHIEVEMENT


This prediction will be tested using two 
groups of men whose occupations could be 
described either as conceptual or motoric in 
orientation. Furthermore, two sets of pictures 
portraying conceptual and motoric occupa- 
tional settings will be utilized. This distinc- 
tion is similar but not identical to a blue- 
collar (motoric)-white-collar (conceptual) 
distinction. There are some jobs in the blue-collar
setting, however, that would be considered
conceptual (e.g., certain farm managers)
and some in the white-collar setting that
would be considered motoric (e.g., a self-employed
watch repairman).

The second prediction that is reconsidered 
is that the achievement motive is positively 
related to moderate feelings of inadequacy in 
women. This prediction was tested using 
married women who were either full-time
housewives or who were employed on a full- 
time basis outside the home. Two sets of pic- 
tures portraying women at work in the home 
and in business settings outside the home
were the bases for achievement-motivation
scores.

This prediction is derived from Atkinson’s
(1957) risk-taking model which proposes
that people with high achievement motives
set achievement goals that are just a little
above their previous performance while people
with low achievement motives set goals appreciably
above or below their current performance.
One implication from this model is
that people who have high achievement motives
will always experience some discrepancy
between attainment and aspiration. Such
people would also have some but not very 
strong feelings of inadequacy. People with 
low achievement motive will either suffer no 
discrepancy because they set very low aspira- 
tion levels or suffer very strong feelings of 
inadequacy because their aspirations were 
unrealistically high. 

Precedent for the linkage between feelings
of inadequacy and the risk-taking model
comes from a reanalysis of data collected by 
Martire (1953) who had found a positive 
correlation between achievement-motive scores 
and degree of discrepancy between actual and 
ideal evaluation of one’s own achievement 
characteristics. These discrepancies can be 
considered bases for feelings of inadequacy.
Reanalysis showed that moderate discrepancy
on the trait of general success showed the 
strongest relationship to achievement motive: 
44% of those 27 subjects high in achievement 
motive have moderate discrepancies between 
ideal and actual self-evaluation on the trait 
of general success; this is true for only 19% 
of those 26 subjects low in achievement mo- 
tive. An exact test, comparing the high and 
low motive groups’ distribution on moderate 
versus low or high discrepancy yielded a p 
value of .046, with a one-tailed test. 
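
The exact test reported above can be sketched as follows. This is only an illustration: the cell counts are not given in the text and are assumed here by converting the rounded percentages back to whole subjects (44% of 27 is taken as 12, and 19% of 26 as 5), and the function name is hypothetical. Under those assumptions the one-tailed probability comes out close to the reported value.

```python
from math import comb


def fisher_one_tailed(a, b, c, d):
    """Upper-tail Fisher exact probability for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose top-left cell is at least as large as the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1))
    return tail / total


# Assumed counts (back-calculated from the rounded percentages in the text):
# high motive: 12 moderate discrepancy, 15 low or high discrepancy
# low motive:   5 moderate discrepancy, 21 low or high discrepancy
p_value = fisher_one_tailed(12, 15, 5, 21)
print(f"one-tailed exact p = {p_value:.3f}")
```

Different rounding of the percentages would change the counts and therefore the probability, so the figure should be read as a consistency check rather than a reproduction of the original computation.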

The point at issue in this paper is whether 
the predictions of a positive association be- 
tween achievement motive and intergenera- 
tional occupational mobility in men or mod- 
erate feelings of inadequacy in women are dif- 
ferentially supported by scores derived from 
pictures varying in their similarity to the 
subjects’ life situations. The traditional view 
in achievement-motive research would be that 
the predicted relationships would be strongest 
when pictures depicted social situations fa- 
miliar to the subjects. 


GENERAL PROCEDURE 
Subjects 


The subjects comprise subsets of the 2,460 re- 
spondents interviewed in a nationwide study of 
mental health (Gurin, Veroff, & Feld, 1960) con- 
ducted in 1957 by the Survey Research Center under 
the sponsorship of the Joint Commission on Mental 
Illness and Health. The total sample was chosen by 
means of area probability sampling to be representa- 
tive of United States adults 21 years of age and 
older, residing in private households. The subjects 
interviewed represented 87% of those originally 
chosen for the sample. (For further details on sam- 
pling procedures see Gurin et al., 1960, Appendix I.) 

Approximately two-thirds (N=1,619) of the 
sample were randomly selected to receive the thematic
apperceptive measures. Of these subjects, 17%
of the men and 14% of the women did not ade-
quately complete the thematic apperceptive task. 
These losses reduced the representativeness of the 
sample, especially from the lower end of the educa- 
tional spectrum. If these subjects are primarily low 
in achievement motive then attenuated relationships 
will be obtained. 


Procedure 


All subjects were interviewed in their homes by 
interviewers trained in survey research techniques. 
In most instances no one else was present during the 
interview. Standard interview schedules, which specified
the questions to be asked as well as their sequence,
were used. The total interview lasted about
1.5 hours.


Measurement of the Achievement Motive


The general procedure, described in detail elsewhere
(Veroff et al., 1960), was to present the subject
with a picture and ask him to tell a story about
it while looking at the picture. A series of questions 
about each picture was used to provide a basis for 
telling the story. The questions dealt with what was 
happening in the picture, what led up to this, what 
the people want and feel, and how this would end. 
Probes were kept to a minimum. The interviewers 
attempted to record the stories verbatim. Each subject
was presented with six pictures in a standard
sequence;⁴ different pictures were used for men and
women.⁵

The stories were scored according to the pro- 
cedures described by Atkinson (1958b). Three 
checks of intercoder scoring agreement for total 
achievement-motivation scores yielded rank-order 
correlations of .89, .77, and .81. The median per- 
centage agreement in coding achievement imagery 
was 75. 

In the present study, the presence or absence of 
achievement imagery in stories told to each picture 
was used as the measure of achievement motive in 
order to avoid the necessity for correcting achievement-motive
scores for the effects of verbal fluency.⁶
Achievement imagery can be scored under three 
general criteria, as follows: 

1. Standard of excellence. Competition with an 
evaluative standard of good performance is a con- 
cern of a character in the story, for example, doing 
a good job, doing something well. 

2. Unique accomplishment. A unique outcome 
which will be generally accepted as a personal ac- 
complishment is a goal of a character, for example, 
an invention.

3. Long-term involvement. The achievement of a 
long-term goal is the concern of one of the char- 
acters, for example, a career goal. 


Achievement Imagery and Occupational 
Mobility in Men

This analysis is confined to that portion of the
national sample of 597 men with adequate thematic
apperceptive protocols who are white, employed
full time, and who were not reared on farms.⁷


4 In a study using these same pictures, college
subjects wrote stories about the pictures. The measures
were administered in a Latin-square design
controlling order of picture presentation. There were
no significant order effects (Veroff et al., 1960).

5 Copies of these pictures are available (4 × 6
inch pictures). They may be purchased from the
Survey Research Center, Ann Arbor, Michigan.

6 These scores had been corrected for differences
in verbal fluency. See Veroff et al. (1960) for
these procedures.

7 These subjects are not identical with those used
by Crockett; his subjects included men who were


Occupational Mobility


Occupational mobility is defined by Crockett’s 
(1960) criteria. Occupations of subjects and their 
fathers were assigned prestige ratings based on a 
previous national survey that yielded estimates of 
prestige values (National Opinion Research Center, 
1953). The distribution of these ratings was divided 
into four approximately equal groups that form the 
basis of the present analysis. Ratings of occupational
prestige of fathers and sons in these four
categories were then compared to produce two major
groups: an upwardly mobile group (occupational
prestige of sons higher than of fathers); and a
combined downwardly mobile (occupational prestige
of sons lower than of fathers) and stable group
(occupational prestige of sons and fathers similar).
This procedure disregards the original level of occupational
prestige of the fathers; the relative dif-
ference of occupational prestige of fathers and sons 
is the only consideration. Following Crockett, we 
considered only those subjects for whom upward 
mobility was possible, that is, subjects whose fath- 
ers’ occupations were not in the highest prestige 
category. 
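
The classification rule described above can be sketched in a few lines. The coding of the four prestige categories as the integers 1 (lowest) to 4 (highest), the function name, and the group labels are assumptions made for illustration; the text specifies only the comparison of father and son categories and the exclusion of fathers in the highest category.

```python
from typing import Optional

# Assumed coding: prestige quartiles run from 1 (lowest) to 4 (highest).
TOP_CATEGORY = 4


def classify_mobility(father_quartile: int, son_quartile: int) -> Optional[str]:
    """Assign a father-son pair to one of the two mobility groups.

    Returns None when the father is already in the highest prestige
    category, because upward mobility is then impossible and the
    subject is left out of the analysis.
    """
    if father_quartile == TOP_CATEGORY:
        return None
    if son_quartile > father_quartile:
        return "upward"
    # Stable and downwardly mobile subjects form one combined group.
    return "stable_or_downward"


print(classify_mobility(2, 3))  # son in a higher category than father
print(classify_mobility(3, 3))  # same category
print(classify_mobility(4, 4))  # excluded from the analysis
```

Only the relative position of father and son enters the rule, which mirrors the statement that the original level of the father's occupational prestige is disregarded.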


Male Pictures 


The pictures used to elicit fantasies are the essen- 
tial comparison points of this analysis. For men, two 
of the six pictures used had sufficient achievement 
imagery to warrant a detailed analysis in this study,
although supplementary results from other pictures 
were also analyzed. The two pictures that elicited 
the most frequent imagery were: 

Picture 1: Two men dressed in work clothes, 
working in a shop at a machine. This is sometimes 
seen as inventors. 

Picture 2: Four men grouped around a table with 
coffee cups on it. One man is writing on a sheaf of 
papers. The men are wearing short-sleeved shirts 
and no ties. 

The other two pictures that were considered are: 

Picture 4: Man seated at a drafting board, wear- 
ing a white shirt and tie. 

Picture 5: Conference group, seven men variously
grouped around a conference table. All are wearing
business suits and ties.

The numbers used above describe
the sequence of the pictures in the set of six
pictures used in the present study. Pictures 1, 2, 4,
and 5 are, respectively, Pictures 2, 101, 28, and 83
in Atkinson’s list of pictures (Atkinson, 1958, p.
832). The first picture was usually interpreted as a
blue-collar setting where persons were working with
their hands (motoric). The second picture is somewhat
more ambiguous, but the stories indicated
that most interpretations were of a white-collar
setting where persons were working with their
heads (conceptual), although it was sometimes seen
as a social group of men with unidentified occupations.

not employed full time. This further restriction resulted
in a reduction in the number of subjects from
368 to 311. The further restriction was added because
we considered only full-time employment as an
accurate gauge of occupational attainment. We have
no reason to think that the further restriction to
the present sample places any bias on the relationship
between achievement motivation and mobility;
if anything, we anticipated that it would sharpen
the association.


Male Occupational Criteria


In order to examine the effects of the different
types of work situations portrayed in the pictures upon
the validity of the achievement imagery scores 
based on stories told to these pictures, the men were 
differentiated by occupation in a way that was 
parallel to the differences that existed in the pictures. 
This differentiation was based upon a measure used 
by Crockett (1960). It divided men into those 
whose occupations demanded a conceptual orientation
to work, N = 149, and those whose occupations
demanded a motoric orientation to work, N = 166.


Analysis Procedure 


The prediction that upward occupational mobility 
would be associated with high achievement imagery 
was separately tested for those subjects from motoric 
and conceptual occupations; the significance level of 
this directional prediction was evaluated with a
one-tailed test for the 2 × 2 chi-square value. The C
statistic (Peters & Van Voorhis, 1940, p. 398) is used
to estimate degree of association. Two achievement 
imagery scores were used, one based on Picture 1, 
depicting a motoric work setting, the other based on 
Picture 2, depicting a conceptual situation. 
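
A rough sketch of this kind of analysis is given below. Several things in it are assumptions rather than the authors' procedure: the cell counts are back-calculated from rounded percentages and group sizes for motoric men on Picture 2 (40% of 45 taken as 18, 21% of 118 taken as 25), the chi-square is the ordinary uncorrected Pearson value, and the coefficient is the plain contingency coefficient sqrt(chi-square / (N + chi-square)), which may differ from the form of the statistic taken from Peters and Van Voorhis. The output therefore approximates rather than reproduces the published figures.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def contingency_coefficient(chi2, n):
    """Uncorrected contingency coefficient, sqrt(chi2 / (n + chi2))."""
    return sqrt(chi2 / (n + chi2))


def one_tailed_p(chi2):
    """One-tailed p for df = 1, valid when the difference is in the predicted direction.

    With one degree of freedom the two-tailed p equals erfc(sqrt(chi2 / 2));
    a directional test halves it.
    """
    return 0.5 * erfc(sqrt(chi2 / 2))


# Assumed counts: rows are imagery present / absent,
# columns are upwardly mobile / not upwardly mobile.
present_up, present_not = 18, 27
absent_up, absent_not = 25, 93

chi2 = chi_square_2x2(present_up, present_not, absent_up, absent_not)
n_total = present_up + present_not + absent_up + absent_not
coefficient = contingency_coefficient(chi2, n_total)
p_one = one_tailed_p(chi2)
print(f"chi-square = {chi2:.2f}, C = {coefficient:.2f}, one-tailed p = {p_one:.4f}")
```

Halving the two-tailed probability is appropriate only because the direction of the difference was predicted in advance, as the analysis procedure states.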


Male Results 


There was only one picture which yielded a 
difference approaching significance in either 
the gross amount of achievement imagery or 
in the kinds of achievement imagery expressed 
by men from either occupational category 
(Table 1). Men in conceptual occupations
are more likely to respond with achievement
imagery to the motoric picture (Picture 1),
and this difference seems totally attributable
to achievement imagery of the unique accomplishment
type. Unique accomplishment
is the criterion most frequently used to score 
stories to this picture in both groups. On the 
basis of the remaining data, however, there is 
no reason to suppose that the other pictures 
have differential achievement cue strength for 
persons working in motoric and conceptual 
settings. 

The predictive usefulness of the achievement
imagery scores from the two major
pictures does seem to differ for persons from
differing social backgrounds. Table 2 presents 
the relationships between the presence or ab- 
sence of achievement imagery to each picture 
and the presence or absence of upward occu- 
pational mobility, separately for each of the 
two occupational groups. In three of the four 
comparisons there is a positive trend in the 
predicted direction; that is, Crockett's general 
finding about the positive association between 
achievement motive expressed in fantasy and 
intergenerational occupational mobility exists 
in each of the subgroups for each of the pic- 
tures. However, the strength of the relation- 
ship varies for the occupational groups in in- 
teraction with the particular picture. 

Dissimilarity between occupational status
and the occupational setting of the pictures
used to elicit stories yielded the most valid
achievement imagery scores. For Picture 1, depicting
a motoric occupational setting, the relationship
between occupational mobility and
achievement imagery is significant only for
persons from conceptual occupations. For
Picture 2, depicting a conceptual occupational
setting, the relationship between occupational
mobility and achievement imagery
is significant only for persons from motoric
occupational backgrounds and not for persons
from conceptual occupational backgrounds.


TABLE 1


PERCENTAGE OF MEN GIVING PARTICULAR TYPES OF ACHIEVEMENT IMAGERY
(BY PICTURE AND OCCUPATION)


Picture 2 Picture 3 Picture 4
Machine shop | Table group | Draftsmen | Conference
Type of achievement imagery
Conceptual a | Motoric b | Conceptual | Motoric | Conceptual | Motoric | Conceptual | Motoric
Standard of excellence 7 8 17 19 9 10 EA 12
Unique accomplishment 30 19 5 5 ^ m x Ed
Long-term involvement 1 3 5 45
All types 28* 25 29 12 m 11 12


a N = 149 men whose occupations were conceptual.
b N = 166 men whose occupations were motoric.
* <1%.
χ² = 3.51, df = 1, p < .10, two-tailed test.

We have also examined the effectiveness of 
the different criteria of achievement imagery 
in generating an association with occupa- 
tional mobility. For that analysis, the criteria 
of unique accomplishment and long-term in- 
volvement were combined since both usually 
reflect a statement of career involvement. The 
relationship between occupational mobility 
and the presence of achievement imagery 
based on the career involvement criteria ver- 
sus the standard of excellence criterion was 
then compared separately for each picture and 
each occupational grouping. There were no 
significant differences between the effective- 
ness of these two types of criteria. There were 
just a few cases responding with a particular 
type of achievement imagery in certain in- 
stances, and so this lack of differences should 
be evaluated with caution. It should be re- 
called, however, that for both occupational 
groupings the unique accomplishment criterion 
was most used in the scoring of the stories 
told about the picture of the machine shop 
and the standard of excellence criterion was 
used most frequently in scoring the stories 
about the table group scene. 


TABLE 2


ACHIEVEMENT IMAGERY AND OCCUPATIONAL MOBILITY
IN MEN (BY PICTURE AND OCCUPATION)


Picture and           Achievement imagery a, b
occupation            Present     Absent      χ²        C

1. Machine shop
   Conceptual          718          52        3.90*     .26
                       ‘ aa | G5
   Motoric              28          26        <1        .00
                       (47)       (115)
2. Table group
   Conceptual           56          60        <1        .00
                       (34)        (98)
   Motoric              40          21        6.29**    .30
                       (45)       (118)


a Entries are the percentage of men in each achievement
imagery group who are upwardly mobile.
b Ns appear in parentheses.
* p < .025.
** p < .01.


Discussion of Male Results


It would seem to be an error to disregard 
the interaction of picture cues and social experience.
There is evidence that the validity
of achievement-motivation measures based on 
content analysis of stories told to the same 
stimulus pictures differs for persons with differing
work experiences. These results are congruent
with the theoretical analysis of thematic
apperceptive measurement of motives
proposed by Atkinson (1958b). The specific
finding that dissimilarity between occupational
setting of pictures and the subject’s
background yielded achievement-motivation
scores with better prediction to occupational
mobility than did similarity is not anticipated
by that theoretical model, or by other
theoretical analyses that rely upon the stimulus
generalization concept (e.g., Goss &
Brownell, 1957).

It might be argued that the type of occu- 
pational background in the subject's early 
experience (father's occupation) may be a
more relevant dimension of similarity from 
which generalization to pictures of work set- 
tings should be predicted. That position would 
be based on the assumptions that early ex- 
periences are crucial in the development of 
motive dispositions and that the setting of 
one's father's work during childhood provides 
cues relevant to the elicitation of achieve- 
ment motivation. Therefore, we reanalyzed 
the data, dichotomizing the subjects' fathers'
occupations into motoric and conceptual and 
tested the relationship between achievement 
imagery and occupational mobility for each 
of these groups, separately for Pictures 1 and 
2. The results yielded similar though not as 
striking evidence for the validity of using 
dissimilar pictures to those based on the sub- 
jects’ current work status. For Picture 1, the 
machine shop scene, C = .31, p < .025 for 
subjects from conceptual backgrounds, while 
C = .08 for subjects from motoric back-
grounds. For Picture 2, the table group, C =
.04 for conceptual subjects and .14 for mo-
toric subjects; this trend is in the expected
direction, but neither relationship was signifi- 
cant. 

Furthermore we looked into the relative 
effectiveness of the different criteria used to 
code achievement imagery for subjects whose
fathers were from different backgrounds. The 
subjects differed neither in total imagery or 
type of imagery produced nor in the relative 
effectiveness of types of imagery in produc- 
ing a relationship to occupational mobility. 
It is difficult to interpret why pictures that
are relatively similar to the situation subjects
are currently experiencing are not as effective
as pictures that are relatively dissimilar. A
number of possible explanations could be
offered. None of them explains all the results,
but each seems to have some merit.

One approach would be to assume that 
there is conflict about the expression of 
achievement motivation, and further to as- 
sume that the generalization gradient for 
avoidance behavior is steeper than that for 
approach behavior. This would imply that 
more approach behavior, that is, the expres- 
sion of achievement motivation, would occur 
in scenes dissimilar to real-life achievement 
situations. Thus, men with conceptual work 
experiences would reveal more achievement 
motivation in stories told to pictures of mo- 
toric work settings, and men with motoric 
occupations would reveal more achievement 
motivation in stories told to pictures of con- 
ceptual work scenes. Only the data for men 
engaged in conceptual occupations fit this 
model; they show greater amounts of achieve- 
ment imagery to Picture 1, the machine shop 
scene, than to Picture 2, the table group. But 
there are no differences in the amount of 
achievement imagery elicited by these two 
pictures for men with motoric occupations. 
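
The gradient argument above can be made concrete with a small numerical sketch. The linear form of the gradients and every parameter value below are hypothetical choices for illustration, not quantities estimated in this study; the only property taken from the text is that the avoidance gradient is steeper than the approach gradient.

```python
def tendency(strength_at_zero, slope, dissimilarity):
    """Linear generalization gradient: strength falls with dissimilarity, never below zero."""
    return max(0.0, strength_at_zero - slope * dissimilarity)


def net_expression(dissimilarity, approach=(10.0, 1.0), avoidance=(12.0, 3.0)):
    """Approach minus avoidance tendency at a given picture-to-life dissimilarity.

    Each tuple is (strength at zero dissimilarity, slope). The values are
    hypothetical; the avoidance slope is simply set steeper than the approach slope.
    """
    approach_strength = tendency(approach[0], approach[1], dissimilarity)
    avoidance_strength = tendency(avoidance[0], avoidance[1], dissimilarity)
    return approach_strength - avoidance_strength


if __name__ == "__main__":
    # Net expression of the motive grows as the scene becomes less like real life.
    for d in range(5):
        print(d, net_expression(d))
```

Under these assumed values avoidance outweighs approach in the most similar scene, and the balance reverses as dissimilarity increases, which is the pattern the conflict explanation requires.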

Another type of explanation seems called 
for. It appears helpful to consider the two 
different functions that fantasy behavior can 
serve—one an unrealistic wish fulfillment 
that acts as a substitute for overt behavior, 
and the other, a portrayal of the motivations 
commonly enacted in the situations. In the 
first instance, there may be a negative cor- 
relation between fantasy behavior and overt 
behavior in the situations portrayed. In the 
second instance there would be a correspond- 
ence between overt and fantasy behavior. 
One might speculate that since the machine 
shop setting (Picture 1) realistically offers 
little career mobility in the life experience of 
those from motoric backgrounds, achieve-
ment fantasies are a clear instance of unreal-
istic “wish fulfillment.” However, similar fan- 
tasies from subjects from conceptual back- 
grounds who do not have a reality-based 
appraisal of the motoric setting might be tied 
more closely to their own ongoing motivation. 
Reviewing the subcategory analysis of the 
achievement-motivation scoring lends some 
support to this interpretation for Picture 1. 
For the picture of the machine shop we found 
a predominance of the unique accomplish- 
ment criterion for all subjects (Table 1). 
This type of imagery might easily be inter- 
preted as “unrealistic” fantasy for the sub- 
jects from motoric occupations whose jobs 
belie the probability of the occurrence of 
unique accomplishments. In contrast, similar 
imagery by those from conceptual back- 
grounds may be a positive projection of their 
own goals into an unfamiliar situation. Thus, 
the first explanation seems to help understand 
the results obtained from this picture. But
for Picture 2 representing the conceptual sit- 
uation, the standard of excellence criterion 
prevailed and for neither group of subjects 
could such fantasy clearly be viewed as un- 
realistic. 

Another type of explanation seems possible 
for Picture 2. Henry (1956) pointed out that 
seeking culturally appropriate pictures raises 
the danger of eliciting stereotyped responses 
representing the culturally determined mean- 
ings that members of the society or subgroup 
habitually attribute to “stock” situations or 
objects portrayed. In the present study, the 
stories told to Picture 2 by subjects whose 
fathers or who themselves have conceptual oc- 
cupations, or the stories told to Picture 1 by 
subjects whose fathers or themselves have 
worked with their hands, may reflect habitual 
subgroup modes of interpreting such situa- 
tions rather than individual, motivationally 
determined perceptions. Once again, a more 
refined analysis of achievement imagery seems 
useful. Picture 2 elicited a predominance of 
achievement imagery based on the standard 
of excellence criterion. This type of imagery 
may very well be a habituated response for 
the subjects from conceptual jobs and there- 
fore not predict individual differences in mo- 
bility. In contrast, for those from motoric 
jobs who are not typically in conceptual work 
situations, this kind of imagery may represent
motivationally determined perception. In the
case of Picture 1, where “unique accomplish-
ment” was the criterion most often used in
scoring achievement imagery, it would be 
difficult to construe such imagery as culturally 
stereotyped responses for either occupational 
grouping. 

In short, it seems that no one explanation 
suffices, and yet several are tenable, each 
perhaps more useful in particular instances 
when analyzing results from different pictures. 
The various explanations are not contradic- 
tory. Under conditions of either conflict about 
expression of motivation, or of strong cul- 
tural determination of appropriate expression 
of motivation, dissimilar pictures should yield 
more valid indices of subjects’ own typical 
motivation. Furthermore, the use of unfamil-
iar settings seems to avoid the flight into
blatantly unrealistic wish fulfillment fantasies. 


Achievement Imagery and the Expression of 
Feelings of Social Inadequacy in Women 


Previous findings on the validity of wom- 
en's achievement-motivation scores based on 
thematic apperceptive techniques have been 
both limited and inconsistent. This made diffi- 
cult our search for a previously tested rela- 
tionship that could be reassessed with the 
available data while at the same time consid- 
ering the relevance of the similarity between 
the subject's social situation and the situa- 
tions portrayed in pictures. Despite differences 
in operations and the use of male subjects, 
these conditions were most nearly met by an 
earlier study by Martire (1953) where a 
positive relationship was found between gen- 
eralized strong achievement motive and dis- 
crepancy between ideal self-concept (those 
traits which are important to an individual) 
and actual self-concept (those traits which are 
characteristic of an individual). That study 
was based upon the assumption that persons 
with high achievement motives would be char- 
acterized by higher levels of aspiration than 
would persons with low achievement motives. 
We have already indicated that a reanalysis
of these data shows that moderate discrepancies are
related to strength of achievement motive
scores, following Atkinson’s thinking on de-
terminants of risk-taking. Therefore we used
this finding as a basis for evaluation of the 
validity of women’s achievement-motive 
scores. 


Subjects 


The subjects in this analysis consist of that seg- 
ment of the national sample of subjects with ade- 
quate thematic apperceptive protocols who are 
women, white, presently married, have children, and 
are either full-time housewives (N = 148), or who
hold full-time jobs outside the home (N = 30).


Measure of Feelings of Inadequacy 


The measure of feelings of inadequacy is based 
upon the summation of responses to three questions 
asked later in the same survey in which the sub- 
jects’ achievement motive was assessed. These 
questions attained the highest loadings on a factor 
we have called “social inadequacy,” isolated in a 
factor-analytic study of 19 indices of self-evalua-
tions of distress (Veroff, Feld, & Gurin, 1962). The
questions and the scored responses are: 


Many people when they think about their chil- 
dren, would like them to be different from them- 
selves in some ways. If you had a daughter, how
would you like her to be different from you? 

1. Does not want child to be different. 

2. Wants child to be different. 

Many women feel that they are not as good 
wives as they would like to be. Have you ever 
felt this way? Have you felt this way a lot of 
times or only once in a while? 

1. Never. 

2. Once in a while. 

3. A lot of times. 

Many women feel that they are not as good 
mothers as they would like to be. Have you ever 
felt this way? Have you felt this way a lot of 
times, or only once in a while? 

1. Never. 

2. Once in a while. 

3. A lot of times. 


For present purposes, the summed scores on these 
questions were divided into as near equal thirds as 
possible. 
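
The scoring just described can be sketched as follows. The function names and the tie-handling rule (choosing the pair of cut points that leaves the three groups closest to equal size without splitting respondents who have the same score) are assumptions made for illustration; the text says only that the summed scores were divided into as near equal thirds as possible.

```python
from itertools import combinations


def inadequacy_score(child_different, wife, mother):
    """Sum the three coded responses (child item coded 1-2, wife and mother items coded 1-3)."""
    if child_different not in (1, 2) or wife not in (1, 2, 3) or mother not in (1, 2, 3):
        raise ValueError("response code outside the scored range")
    return child_different + wife + mother


def tertile_cuts(scores):
    """Return cut points (c1, c2): low is <= c1, moderate is <= c2, high is above c2.

    Assumed rule: among all cuts at observed score values, pick the pair that
    makes the three group sizes deviate least from one third of the sample.
    """
    values = sorted(set(scores))
    n = len(scores)
    best = None
    # Excluding the largest value keeps the high group from being empty.
    for c1, c2 in combinations(values[:-1], 2):
        sizes = [
            sum(s <= c1 for s in scores),
            sum(c1 < s <= c2 for s in scores),
            sum(s > c2 for s in scores),
        ]
        imbalance = sum(abs(size - n / 3) for size in sizes)
        if best is None or imbalance < best[0]:
            best = (imbalance, c1, c2)
    if best is None:
        raise ValueError("need at least three distinct scores")
    return best[1], best[2]


def classify(score, cuts):
    """Label one summed score as low, moderate, or high given the cut points."""
    c1, c2 = cuts
    if score <= c1:
        return "low"
    return "moderate" if score <= c2 else "high"


if __name__ == "__main__":
    hypothetical_scores = [3, 4, 4, 5, 5, 5, 6, 6, 7, 8]  # made-up data
    cuts = tertile_cuts(hypothetical_scores)
    print(cuts, [classify(s, cuts) for s in hypothetical_scores])
```

The middle group produced this way corresponds to the "moderate" inadequacy category used in the analyses that follow.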


Female Social Characteristics Criteria


The social situation used in the present analysis 
concerns the current work status of the married 
women in this subsample dichotomized into those 
who are full-time housewives and those who hold 
full-time jobs outside their homes. As in the analy- 
sis of the men's data, we are assuming that these 
differential life experiences provide dissimilar frames 
of reference for the evaluation of the consequences 
and possibilities of behavior in different work set- 
tings and for the expression of motivational striv- 
ings in fantasy. 
Female Pictures


The primary comparison points of this part of 
the study were the different pictures used to elicit 
achievement imagery for the women. The construct 
validity of women's achievement imagery scores de-
rived from pictures about a "career" setting versus the home
setting is compared. Three of the pictures used to
measure motivation for women showed this distinc-
tion. One, Picture 1, suggests the career setting for
many women: it is very often seen as a medical
or research laboratory. The other two pictures ana-
lyzed here are generally perceived as depicting
women at work at home. The pictures are:


Picture 1: Two women standing by a table and 
one woman is working with test tubes. 

Picture 4: A woman kneeling and applying a cover 
to a large chair. 

Picture 5: Two women preparing food in the
kitchen.


Analysis Procedure 


The analysis for the present study using these
pictures grouped women according to whether or not
they expressed achievement imagery in the career
picture and also whether or not they expressed
achievement imagery in either of the two home-
making pictures. The prediction that the presence of
achievement imagery was positively related to the
expression of moderate feelings of social inadequacy
was separately tested for housewives and employed
women using these two achievement imagery scores;
the significance level of this directional prediction
was evaluated with a one-tailed test for the 2 × 2
chi-square value. The C statistic is used to estimate
degree of association.
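
A minimal sketch of this kind of test is given below. It assumes the uncorrected Pearson chi-square for a 2 × 2 table and takes C to be the ordinary contingency coefficient, sqrt(χ² / (χ² + N)); the text does not say whether a continuity correction or an adjusted form of C was applied, so the tabled values need not match this computation. The cell counts are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denominator


def two_tailed_p(chi2):
    """Upper-tail probability of chi-square with 1 df (equals P(|Z| > sqrt(chi2)))."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def one_tailed_p(chi2, in_predicted_direction):
    """Halve the two-tailed p when the difference lies in the predicted direction."""
    p = two_tailed_p(chi2)
    return p / 2.0 if in_predicted_direction else 1.0 - p / 2.0


def contingency_coefficient(chi2, n):
    """Contingency coefficient C (assumed form: sqrt(chi2 / (chi2 + n)))."""
    return math.sqrt(chi2 / (chi2 + n))


if __name__ == "__main__":
    # Hypothetical counts: rows = imagery present/absent, columns = criterion met/not met.
    a, b, c, d = 20, 10, 10, 20
    chi2 = chi_square_2x2(a, b, c, d)
    print(chi2, one_tailed_p(chi2, a * d > b * c), contingency_coefficient(chi2, a + b + c + d))
```

Halving the two-tailed probability is appropriate only when the direction was predicted in advance, as it was here.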


Female Results 


There were no significant differences in the 
proportion of housewives and employed moth- 
ers who told stories scored for achievement 
imagery for either set of pictures, although 

there is some tendency for housewives to more
often tell achievement imagery stories to the
laboratory picture (Table 3). The achieve-
ment cue strength of these pictures is there-
fore assumed to be approximately equal for
the two subgroups of women.


TABLE 3


PERCENTAGE OF WOMEN GIVING ACHIEVEMENT
IMAGERY (BY PICTURE AND WORK STATUS)


                       Picture 1                 Picture 4 or 5
                       Laboratory                Home
Achievement imagery    Employed    Housewife     Employed    Housewife
Present                33          47            57          57
Absent                 67          53            43          43
Total N                30          148           30          148


TABLE 4


ACHIEVEMENT IMAGERY AND EXPRESSION OF MODERATE
SOCIAL INADEQUACY IN WOMEN
(BY PICTURE AND WORK STATUS)


                           Achievement imagery
Picture and work status    Present    Absent    χ²       C
1. Laboratory
   Employed                20ᵃ        30        <1       .47
                           (10)       (20)
   Housewife               35         22        2.60*    .23
                           (69)       (79)
4 or 5. Home
   Employed                24         31        <1       .13
                           (17)       (13)
   Housewife               32         22        1.20     (illegible)
                           (85)       (63)


a Entries are the percentage of women in each achievement
imagery group who express moderate inadequacy.
*p < .10.

The validity of any of the women’s achieve- 
ment imagery scores was not strongly sup- 
ported by the present data (Table 4). None 
of the relationships between achievement 
imagery and expressed feelings of social in- 
adequacy are highly significant. However, for 
housewives, scores based on both sets of pic- 
tures are in the right direction, with a greater 
tendency for the laboratory picture to be 
more effective than the housewife pictures. 
Stories told to the laboratory picture by em-
ployed women, if anything, reflect imagery
from people who should be considered low in 
achievement motive. 


Discussion of Female Results 


Although the results are not very sharp,
there is some tendency for them to follow the 
same pattern that the results for the males 
did. Stories told about an unfamiliar setting— 
a career setting for housewives—produced the 
most valid achievement-motive scores. 

Certain specific limitations of the study for 
women should be noted as possible contribu- 
tors to the more equivocal results here than 
with men. The sample of working women was 
much smaller than any other subgroup of men 
or women. Perhaps more importantly, the 
designation of familiar and unfamiliar pic- 
tures for housewives and employed women 
was not entirely clear-cut. The fact that all 
women are confronted with some aspects of
the housewife role may account for why in 
neither group the scores from housewife pic- 
tures produced valid results. The picture of a 
laboratory scene designated as more similar 
to the employed women's work setting than 
the housewives' is not specifically similar to
the type of work most of the employed women 
do. The majority of them are employed in 
clerical and sales occupations. The major 
drawback of this analysis is that unlike the 
male study, this was not an attempt to retest 
a hypothesis directly supported by previous 
work with a similar population and operations.


CONCLUSIONS 


From the results of our analyses, we may 
tentatively conclude that thematic appercep- 
tive assessment of achievement motive re- 
quires consideration of the potential interac- 
tion between the nature of the picture stimuli 
and the subjects' subcultural ties. The same
physical stimuli appear to elicit fantasy with 
different predictive value for different groups 
of subjects. 

In the present instance, the presentation of 
pictures of familiar work settings yielded less 
valid indices of achievement motive than did 
less familiar ones. These results are inter- 
preted to mean that the most appropriate 
measure of motives may come from using 
tests that put the person in a situation with 
which he is not intimately familiar. Such fa- 
miliar situations can easily arouse not only 
defensive distortion to confound measurement 
but also can engage complex habitual re- 
sponses that have lost their individual affec- 
tive significance. The researcher, faced with 
the necessity of using a standard set of stimu- 
lus pictures with subjects from varying social 
backgrounds, should consider basing his meas- 
ure of motivation on a series of stimuli that 
display varying levels of familiarity to the 
subjects. The total set of pictures from which 
we selected for the present study was designed 
with such variability in mind, and achieve-
ment-motivation scores summed across all pic-
tures were related to occupational mobility
across subgroups (Crockett, 1960). 

In only one instance is there evidence for
differential amounts of achievement imagery
being elicited by pictures of work settings
that vary in their similarity to the story-
teller’s work history.

Finally, certain limitations of the present 
study design that restrict the generality of our 
conclusions should be noted. We have: inves- 
tigated the effects of pictures only upon the 
validity of one type of motive score, achieve-
ment motive; successfully tested its validity
by prediction to only one external criterion,
occupational mobility; considered only one
aspect of similarity between subject and pic- 
ture, the motoric or conceptual orientation of 
the work setting; and utilized a very limited 
number of pictures. Despite these limitations, 
the results clearly call attention to an im- 
portant source of potential error in thematic 
apperceptive measures of motivation. 


EFFECTS OF SUBJECT AND EXPERIMENTER ATTITUDES
IN VERBAL CONDITIONING


The effects of reinforcer (R) and S attitudes toward one another upon verbal
conditioning were studied using a Taffel-type task. Ss with either positive or
negative attitudes conditioned while neutral Ss did not. While the main effect
of R attitudes was not significant, a significant interaction with trials suggested
that hostile Rs inhibited conditioning. Ss with negative attitudes shifted at-
titudes, partly as a function of reinforcement, and the shift was significantly
correlated with conditioning. Aware and unaware Ss, as measured by Spiel-
berger’s procedure, conditioned equally well. Ss with high desire for rein-
forcement conditioned significantly more than Ss with low desire. The
importance of cognitive processes in verbal conditioning was discussed.


While considerable effort has been devoted
to demonstrating that conditioning of verbal
behavior occurs (cf. reviews by Greenspoon,
1962; Krasner, 1958; London & Rosenhan,
1964; Salzinger, 1959), only recently has
interest been focused on the effects of inter-
personal attitudes upon the verbal-condition-
ing process. The importance of interpersonal
attitudes is suggested by the potential rele-
vance of verbal conditioning to psycho-
therapeutic phenomena (Ullmann, Krasner, &
Collins, 1961). In the clinical literature, it is
almost axiomatic that the attitudes of the
therapist and patient toward one another play
a crucial role in the psychotherapeutic proc-
ess. If verbal conditioning is an important
mechanism in the practice of psychotherapy,
then these attitudinal factors ought to affect
the process of verbal conditioning.

Several studies have addressed themselves
to the effects of interpersonal variables upon
verbal conditioning. Sapolsky (1960) found
less conditioning when the reinforcer and the
subject were incompatible than when com-
patible. Ferguson and Buss (1959) found that
an aggressive reinforcer inhibited condition-
ing of hostile verbs. Significant interaction
effects have been reported between personal-
ity characteristics (hostility as measured by
questionnaire) and various situational vari-
ables including sex of the experimenter
(Sarason, 1962; Sarason & Minard, 1963).
Krasner, Weiss, and Ullmann (1961) re-
ported that hostile-acting reinforcers reduced
conditioning of emotional words and showed
that atmosphere had a significant interaction
effect with awareness. Aware subjects with a
hostile reinforcer showed less conditioning
than did aware subjects with a nonhostile
reinforcer. Farber (1963) found that failure
experiences with “a nasty” reinforcer inhibited
the subject’s awareness, awareness in turn
being related to the subject’s conditioning.

The purpose of this investigation was to
study the influence of reinforcer and subject
attitudes toward one another upon verbal con-
ditioning by experimentally manipulating in-
terpersonal attitudes. Furthermore, since Far-
ber (1963) and Spielberger (1962) reported
[...] conditioning and Krasner, Weiss, and
Ullmann (1961) found an interaction between
awareness and atmosphere effects, the interrelations
of awareness, attitudes, and conditioning
were studied. Reinforcers and subjects were
given a battery of paper-and-pencil tests con-
sisting of the California Psychological Inven-
tory (CPI), the Crowne-Marlowe Need for
Social Approval scale, and the Sarason Hos-
tility scale (Sarason, 1962). Such measures
of personality were obtained in an attempt
to replicate previous findings (Crowne & 
Strickland, 1961; Krasner, Ullmann, Weiss, 
& Collins, 1961; Sarason & Minard, 1963) 
of relationships between personality traits 
and verbal conditionability. These particular 
measures were selected as being the most
promising in yielding stable correlations. 

It was generally hypothesized that the
more positive the attitudes on the part of
either the reinforcer or the subject toward
the other, the greater would be the condition-
ing. Interaction effects were anticipated but
not specified. It was expected that awareness
would be related to conditioning.


METHOD 
Subjects 


Subjects and reinforcers consisted of 216 paid 
male volunteers obtained from local universities and 
colleges. One half served as reinforcers, the remaining 
as subjects. The mean age for subjects was 21 years; 
for reinforcers, 20 years. 

The experimental design was a 3 × 3 × 2 factorial
with repeated measurements. There were three levels 
of subject attitudes, three levels of reinforcer atti- 
tudes, and two levels of reinforcement—present or 
absent. Six subjects and six reinforcers were ran- 
domly assigned to each of the 18 groups. The sole 
limitation of the randomization process was the
stipulation that no pair consist of individuals 
previously acquainted. 
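
The design can be laid out in a short sketch: 3 × 3 × 2 = 18 cells, six subject-reinforcer pairs per cell, hence 108 subjects and 108 reinforcers from the 216 volunteers. The identifiers, the `acquainted` set, and the reshuffle-until-valid strategy are hypothetical; the text states only that assignment was random apart from the acquaintance restriction.

```python
import itertools
import random

SUBJECT_ATTITUDES = ("like", "neutral", "dislike")
REINFORCER_ATTITUDES = ("like", "neutral", "dislike")
REINFORCEMENT = ("present", "absent")
PAIRS_PER_CELL = 6


def build_cells():
    """Enumerate the 18 cells of the 3 x 3 x 2 factorial."""
    return list(itertools.product(SUBJECT_ATTITUDES, REINFORCER_ATTITUDES, REINFORCEMENT))


def assign_pairs(subject_ids, reinforcer_ids, acquainted=frozenset(), seed=0, max_attempts=1000):
    """Randomly pair subjects with reinforcers and deal six pairs into each cell.

    `acquainted` holds (subject, reinforcer) tuples that may not be paired.
    Reshuffling until no forbidden pair occurs is an assumed procedure.
    """
    cells = build_cells()
    needed = len(cells) * PAIRS_PER_CELL
    if len(subject_ids) != needed or len(reinforcer_ids) != needed:
        raise ValueError(f"expected {needed} subjects and {needed} reinforcers")
    rng = random.Random(seed)
    for _ in range(max_attempts):
        subjects = list(subject_ids)
        reinforcers = list(reinforcer_ids)
        rng.shuffle(subjects)
        rng.shuffle(reinforcers)
        pairs = list(zip(subjects, reinforcers))
        if not any(pair in acquainted for pair in pairs):
            return {
                cell: pairs[i * PAIRS_PER_CELL:(i + 1) * PAIRS_PER_CELL]
                for i, cell in enumerate(cells)
            }
    raise RuntimeError("no valid pairing found within max_attempts")


if __name__ == "__main__":
    subjects = [f"S{i:03d}" for i in range(108)]
    reinforcers = [f"R{i:03d}" for i in range(108)]
    design = assign_pairs(subjects, reinforcers, acquainted={("S000", "R000")})
    print(len(design), sum(len(pairs) for pairs in design.values()))
```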


Procedure


Both reinforcers and subjects were initially given 
the paper-and-pencil test battery. In addition, rein-
forcers were instructed as to the nature of verbal
conditioning, were taught their roles as reinforcers, 
and were given demonstrations and/or practice in 
these roles. The instructions were administered to 
groups of reinforcers no larger than 15. The hypothe- 
ses concerning reinforcer and subject attitudes were 
not conveyed to reinforcers (or subjects).

For the second session, reinforcers were given 
a review of the conditioning procedures and had 10
practice trials with a research assistant. Reinforcers 
and subjects were then told separately that an at- 
tempt had been made to match them with a partner 
they were sure to like, a procedure also employed
by Sapolsky (1960). Reinforcers and subjects were 
told, “The psychological test that you have taken 
was for the purpose of being able to match you with 
an experimenter [or subject] whom you are sure 
to like.” For the dislike instructions, the following
sentence was then added: “Unfortunately, we cannot 
always do this.” For the neutral instructions the 
following comment was made: “We weren't able to 
achieve an ideal match for you,” while for the like 
instructions, the following was inserted: “Fortu-
nately, we are usually able to do this.” All groups
then received the following instructions: “To give 
you an objective picture of what sort of fellow he 
is, I am going to give you his personality description 
based on his responses to a test.” For those subjects 
and reinforcers receiving the dislike instructions the 
following written personality description was read: 


This man is of adequate intelligence but appears 
to have a large number of personality mechanisms 
which prevent him from dealing adequately with 
people. More specifically, he is rather markedly 
conceited and tends to be somewhat arrogant in 
his dealings with others. He is insensitive to his 
own failings and weaknesses but eager to point 
out the other fellow’s inadequacies. He handles his 
own deep-seated sense of inadequacy by being 
boastful, although he may on occasion control 
such behavior. In summary, he presents a picture 
of being quarrelsome, irritable, and insensitive, 
probably has much difficulty in making and hold- 
ing friends, and blames others for such difficulties. 
Although he is not particularly “emotionally” dis- 
turbed in the psychiatric sense, he is a most 
difficult man to get along with.


The following written personality description was 
read for the neutral instructions: 


This man is slightly above average in intelligence, 
and appears to have a combination of effective 
and ineffective personality traits. He is somewhat 
conceited but usually controls this in his inter- 
actions with others. While he is surprisingly in- 
sensitive to some of his failings, he is quite real-
istic about others. He sometimes will point out
weaknesses in others but usually not in a hostile 
way. Basically he feels he is somewhat inadequate 
but does not let these feelings get the best of him. 
In summary, he presents a picture of being moder-
ately conceited, yet agreeable, varying in sensitiv-
ity and probably has only occasional difficulty in
making and keeping friends. His general level of 
adjustment is fair and he is not hard to get along 
with. 


The like instructions were as follows: 


This man appears to be of adequate intelligence 
and has a number of personality mechanisms 
which enable him to deal very effectively with 
people. More specifically, he is modest and tends 
to be considerate in his dealings with others. He 
is sensitive to his failings and weaknesses, and 
is not prone to point out the other fellow’s inade- 
quacies. Since he feels secure, he does not have 
to become boastful or dependent to raise his self- 
esteem. In summary, he presents a picture of being 
agreeable, easy going, and sensitive, and probably 
has no difficulty in making and holding friends. 
He is quite well adjusted and easy to get along 
with. 


After reading the fake test reports, both subjects
and reinforcers were asked to rate their partner,
sight unseen, on an evaluative questionnaire (Davis
& Jones, 1960; Jones, Gergen, & Davis, 1962). The
questionnaire was used to assess the efficacy of the
attitude-inducing instructions.


TABLE 1

MEAN SCORES FOR REINFORCERS AND SUBJECTS ON THE EVALUATIVE QUESTIONNAIRE
ON BOTH TESTINGS

                              Reinforcement                      Nonreinforcement
                              First occasion   Second occasion   First occasion   Second occasion
Reinforcer like subject       93.44            94.44             96.11            93.28
Reinforcer neutral subject    75.28            80.67             73.72            75.95
Reinforcer dislike subject    53.28            77.17             53.33            69.11
Subject like reinforcer       91.89            88.44             93.17            87.78
Subject neutral reinforcer    78.39            79.00             71.83            82.11
Subject dislike reinforcer    52.94            84.61             51.11            70.83


The verbal-conditioning procedure used the Taffel 
task wherein the subjects are presented with a series 
of cards on which is typed a verb and six personal 
pronouns. Subjects using sentences employing the
first-person pronouns were reinforced by the rein- 
forcer saying “good.” There were 100 conditioning 
and 75 extinction trials. The conditioning trials were 
tape-recorded to insure accurate scoring and to assess 
reinforcer errors. 
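
The dependent measure implied by this procedure, the number of first-person responses in each block of conditioning trials, can be sketched as below. The block size of 20 follows the figure axis used later in the article; the function name and the pronoun labels other than "I" and "We" are assumptions for illustration.

```python
FIRST_PERSON = frozenset({"I", "We"})
BLOCK_SIZE = 20  # 100 conditioning trials -> 5 blocks


def block_scores(pronouns, block_size=BLOCK_SIZE, target=FIRST_PERSON):
    """Count target-pronoun sentences in each consecutive block of trials.

    `pronouns` lists the pronoun the subject chose on each trial, in order.
    """
    if not pronouns or len(pronouns) % block_size != 0:
        raise ValueError("number of trials must be a positive multiple of the block size")
    return [
        sum(p in target for p in pronouns[start:start + block_size])
        for start in range(0, len(pronouns), block_size)
    ]


if __name__ == "__main__":
    # Hypothetical record: first-person use rises from 5 to 13 per block.
    trials = []
    for first_person_count in (5, 7, 9, 11, 13):
        trials += ["I"] * first_person_count + ["He"] * (BLOCK_SIZE - first_person_count)
    print(block_scores(trials))
    print(block_scores(trials, target=frozenset({"We"})))
```

Passing a single-pronoun `target` gives the separate "I" and "We" scores that are analyzed individually in the results.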

After completing the conditioning session, each
partner again rated the other on the same evaluative 
questionnaire. Subjects were then questioned by a 
research assistant, using procedures specified by 
Spielberger (1962), as to their awareness of the 
reinforcement contingencies. Again following Spiel- 
berger's procedure, each subject was also questioned 
about his desire for the reinforcement. The subject 
was required to indicate whether he wanted the 
reinforcer to say "good" very much, somewhat, or
did not care much one way or the other. After 
the completion of all procedures, the nature of the 
experiment was explained to all subjects. 


RESULTS 


Mean scores for subjects and reinforcers 
on the evaluative questionnaire for both test 
occasions are presented in Table 1. The mean 
ratings obtained prior to conditioning con- 
form precisely to expectation and are highly 
significant for both reinforcers and subjects 
(F = 123, p < .01; F = 141, p < .01, re-
spectively). For the postconditioning ratings,
the means for reinforcer attitudes retained
their ordering and the differences were again
significant (F = 16, p < .01). The postcon-
ditioning means for subjects shifted mark-
edly, however, particularly for the subjects
in the dislike-reinforcer group. The mean dif-
ferences were still significant and the shift
in the ordering of the means as a function
of reinforcement is reflected in a significant
Instructions × Reinforcement interaction (F
= 3.79, p < .05).

To assess the effects of attitudes upon the
conditioning process, “I” and “We” responses
were combined and served as the dependent 
variable. The analysis of variance is presented 
in Table 2. 
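
The degrees of freedom in such an analysis follow directly from the design, and a short check is sketched here. It assumes the 100 conditioning trials were scored as five blocks of 20, which is consistent with the 4 df reported for trials; the function name and dictionary keys are illustrative only.

```python
def mixed_design_df(a=2, b=3, c=3, per_cell=6, blocks=5):
    """Degrees of freedom for an A x B x C between-subjects design with repeated blocks.

    a: reinforcement levels; b: reinforcer-attitude levels;
    c: subject-attitude levels; per_cell: subjects per cell; blocks: trial blocks.
    """
    n_subjects = a * b * c * per_cell
    between_effects = {
        "A": a - 1,
        "B": b - 1,
        "C": c - 1,
        "AxB": (a - 1) * (b - 1),
        "AxC": (a - 1) * (c - 1),
        "BxC": (b - 1) * (c - 1),
        "AxBxC": (a - 1) * (b - 1) * (c - 1),
    }
    error_between = (n_subjects - 1) - sum(between_effects.values())
    within_total = n_subjects * (blocks - 1)
    error_within = error_between * (blocks - 1)
    return {
        "subjects": n_subjects,
        "between_effects": between_effects,
        "error_between": error_between,
        # Trials plus every interaction involving trials.
        "trial_effects": within_total - error_within,
        "error_within": error_within,
        "total": n_subjects * blocks - 1,
    }


if __name__ == "__main__":
    print(mixed_design_df())
```

With the default arguments this gives 90 df for the between-subjects error, 360 for the within-subjects error, and 539 in total, the values listed in Table 2.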

The main effects of reinforcement, subject 
attitudes, and trials were significant, and 
there was a significant interaction effect of 
reinforcer attitudes with trials. Contrary to 
expectations, however, the effects of subject 
attitudes were that while positive attitudes 
facilitated conditioning, so did negative atti-
tudes. The "neutral" subjects failed to condi-
tion. Nonreinforced subjects given negative
attitudes tended to emit fewer I and We
responses than did nonreinforced subjects
given positive attitudes. Curves for the condi-
tioning trials when the data are grouped
according to subject attitudes are presented
in Figure 1, while comparisons of groups on
the basis of reinforcer attitudes are presented
in Figure 2.


TABLE 2


ANALYSIS OF VARIANCE OF CONDITIONING TRIALS
USING BOTH FIRST-PERSON PRONOUNS


Source                        df     MS        F
Between subjects
  Reinforcement (A)             1    682.31    15.46
  Experimenter attitudes (B)    2      7.98
  Subject attitudes (C)         2    194.75     4.41*
  A × B                         2     74.48
  A × C                         2    134.72
  B × C                         4     25.76
  A × B × C                     4
  Error (b)                    90
Within subjects
  Trials (D)                    4     80.03    14.10**
  D × A                         4     25.76
  D × B                         8     11.47     2.02*
  Error (w)                   360
Totalᵃ                        539

a Other interactions were not significant.
**p < .01.

Since subject attitudes shifted during the
conditioning session, especially for the rein-
forcer-dislike groups, a difference score based
on the pre- and postconditioning rating scales
was computed for the 18 subjects in the
dislike groups. The shift in attitudes for these
subjects was significantly correlated with
conditioning (rs = .59, r = .65).

Recognizing the possibility that the inde-
pendent variables may differentially affect the
emission of I and We responses, separate
analyses of variance were computed for each
response class. The variables having a sig- 
nificant main effect on the conditioning of 
I were reinforcement (F = 13.90, p < .01),
subject attitudes (F = 3.86, p < .05), and
trials (F = 10.84, p < .01). The significant
interactions were subject attitudes with trials
(F = 3.12, p < .05) and trials with rein-
forcement (F = 2.74, p < .05). The learning
curves for the response I, when data are
grouped by subject attitudes toward the re-
inforcer, are quite similar to the data for I
plus We presented in Figure 1.

Only the main effects of reinforcer attitudes
(F = 3.15, p < .05) and trials (F = 3.56,
p < .01) were significant when the response
class We was the dependent variable. The 
learning curves for the response class We are 
presented in Figure 3. 

Performance during the extinction trials on
the three above-mentioned dependent varia-
bles was analyzed by the analysis of vari-
ance. For the dependent variable I plus We,
the main effects of reinforcement and trials
were significant (F = 10.72, p < .01; F
= 4.88, p < .01, respectively). The former
main effect reflects group differences devel-
oped during the conditioning period, while
the main effect of trials reflects the subject's
increased usage of I plus We over the extinc-
tion trials. There was also a significant inter-
action effect of subject attitudes with rein-
forcement (F = 3.78, p < .05). The latter ef-
fect was due to the reinforced extreme atti-
tude groups (like and dislike) employing
more I plus We responses than the neutral
groups, while the nonreinforced dislike rein-
forcer groups used the fewest I plus We
responses.

[Figure 3: learning curves over blocks of 20 trials for the
three levels of reinforcer attitudes, We response
only.]

In analyzing the use of I-alone responses
during the extinction period, there were sig-
nificant main effects of reinforcement (F
= 9.97, p < .01) and trials (F = 5.13, p
< .01) and a significant interaction effect
of reinforcement with subject attitudes (F
= 4.56, p < .05). Subjects increasingly em-
ployed the I response as a function of the num-
ber of extinction trials, this being particularly
true for subjects with neutral attitudes
towards the reinforcer.

In analyzing the dependent variable We 
for the extinction trials, only the main effect 
of trials was significant (F = 7.41, p < .01). 
This again reflects the subjects’ increasing 
usage of the We response over the course of 
extinction trials.

The two authors independently judged the
awareness questionnaires for the presence or
absence of awareness with 90% agreement.
Thirty-eight subjects were judged aware and
16 unaware. All 38 subjects were aware of the
reinforcement contingencies with the response
I, while 26 were aware of these contingencies
for both I and We. Awareness failed to corre-
late significantly with the summed I plus We
dependent variable. To further test whether
unaware subjects conditioned, a Wilcoxon
matched-pairs signed-ranks test (Siegel, 1956)
was used in which performance of unaware
subjects (N = 16) on the last 20 trials was
compared with their own performance on the
first 20 trials. The results were significant
(p < .01).

Further analysis revealed no difference in
the conditioning of I responses for aware and
unaware groups. However, subjects unaware
of the reinforcement contingencies for We
responses, including 12 subjects who showed
awareness concerning I (N = 28), emitted
significantly fewer We responses than did
aware subjects on the last 20 trials (χ²
= 6.01, df = 1, p < .02). Furthermore, these
“unaware” subjects (N = 28) showed a signifi-
cant increase in the emission of We state-
ments from the first to last 20 conditioning
trials (Wilcoxon matched-pairs signed-ranks test,
p < .05).

Since few subjects stated they wanted the
reinforcement “very much,” the “very much”
and “some” groups were combined (N = 21)
and compared with subjects indifferent to the
reinforcement (N = 33). High-desire subjects
conditioned significantly more than low-desire
subjects (χ² = 4.99, df = 1, p < .05). Learn-
ing curves comparing aware and unaware sub-
jects and subjects with high and low desire
for reinforcement are presented in Figure 4.

There was no relation between subject atti-
tudes and desire for reinforcement. Since sub-
jects holding extreme attitudes, positive or
negative, conditioned similarly, these two
groups were combined (N = 36) and com-
pared with the “neutral” group (N = 18) as
to their desire for reinforcement. A signifi-
cantly greater number of subjects holding
extreme attitudes than “neutral” subjects in-
dicated that they wanted the reinforcer to
say “good,” “very much,” or “somewhat”
(χ² = 3.14, df = 1, p < .05).

No personality measure correlated signifi-
cantly with the sum of I and We responses.


DISCUSSION


The general hypothesis that interpersonal
attitudes affect verbal conditioning was sup-
ported, particularly for subject attitudes. The
nature of the effects, however, is complex.
The subjects with positive and (initially)
negative attitudes towards the reinforcer
conditioned. That positive affect toward the
reinforcer on the subject’s part produces
greater conditioning than “neutral” affective
states essentially confirms clinical notions as
well as the empirical findings of Sapolsky
(1960). The finding that negative subject
attitudes led to as much conditioning as
positive attitudes was contrary to expecta-
tion. Previous work by Sapolsky (1960) and by
Krasner, Weiss, and Ullmann (1961) demonstrated
that hostile atmospheres worked to inhibit
conditioning, while the work of Sarason and
his colleagues (Ganzer & Sarason, 1964;
Sarason, 1962; Sarason & Minard, 1963),
using questionnaire measures of hostility,
also suggests an inhibiting process. It should
be noted, however, that Sapolsky found that
his “low-attraction” subjects evidenced the
effects of reinforcement during the extinction
trials when the experimenter was removed
from the subject’s presence. Sapolsky con-
cluded that subjects who were incompatible
with the reinforcer suppressed the effects of
reinforcement until they moved from the re-
inforcer’s presence. While the results of the
present study and those of Sapolsky are not
totally inconsistent, the evidence of the effect
of reinforcement in the present study was
manifested under very different conditions.
That is, conditioning occurred during the
conditioning trials and in the presence of the
reinforcer. Since Sapolsky used only one ex-
aminer throughout the conditioning of all his
subjects, while the present study employed a
new examiner for each interaction, one might
speculate that the freshness or the enthusiasm
of examiners during a one-shot interview
study is enough to undo the potential effects
of dislike instructions.

It is possible that inducing extreme atti-
tudes serves to increase the subject’s general
drive level, which in turn facilitates the learn-
ing of simple tasks (Spence & Farber, 1953;
Spence & Taylor, 1951). Motivation or drive
strength as measured by the Taylor Manifest
Anxiety scale, however, has not shown con- 
sistent relationships to verbal conditioning 
(Buss & Gerjuoy, 1958; Rogers, 1960; 
Matarazzo, Saslow, & Pareis, 1960). The 
possibility does remain, of course, that experi- 
mentally manipulated drive may have dif- 
ferent effects on learning under these experi- 
mental conditions than drive as inferred on 
the basis of test scores. 

A possible explanation of the unexpected 
result that both positive and negative subjects 
conditioned is suggested by the further find- 
ing that the subjects holding extreme atti- 
tudes toward the reinforcer differed signifi- 
cantly from neutral subjects with regard to 
their desire for reinforcement. Perhaps the 
induction of extreme attitudes alerts the sub- 
ject to interpersonal cues, making him more 
sensitive to reinforcement which in turn in- 
fluences negative attitudes towards the rein- 
forcer. Such a subject would condition more 
and be more eager for reinforcement as was 
found. 

Reinforcer attitudes also affected subjects’
verbal behavior, although not as strongly as
subject attitudes. Reinforcer attitudes yielded 
a significant main effect on We which ap- 
peared to result from an inhibition of this 
response for the dislike groups. There was 
also a significant interaction of reinforcer 
attitudes with trials when I and We were 
summed but this is likely due to the above- 
mentioned relation between reinforcer atti- 
tudes and We responses. The mechanisms by 
which the reinforcer exerts influence cannot 
be determined from the present data. The 
possibility that the effects were produced by 
selective errors in reinforcement, however, 
can be ruled out. Reinforcers committed very 
few errors and these were evenly distributed 
across groups. 

The attitude shifts over the course of the 
conditioning session, particularly for subjects 
given dislike instructions, have both con- 
ceptual and methodological importance. Con- 
ceptually, it is important to note that rein- 
forcement may not only modify specific and 
limited verbal behavior, but also may affect 
the subjects’ perceptions of the reinforcer. 
Furthermore, shifts in attitudes are highly
correlated with conditioning for subjects with
initially negative attitudes. The data indicate
that the shift in attitude rather than absolute 
level of attitude was the important factor since 
initially negative-attitude subjects had post- 
conditioning ratings quite similar to the 
neutral-attitude subjects. It may be that the 
negative-attitude information induced appre- 
hension about the forthcoming interaction 
and that the reduction of this apprehension 
by the reinforcers’ benign behavior and verbal 
reinforcements facilitated conditioning. While 
it may be that a verbal reinforcement does 
lead to the modification of particular response 
classes, it is also clear that verbal reinforce- 
ment, occurring as it does in an interpersonal 
context, has measurable effects on the inter- 
personal perceptions of the participants. 

Methodologically, such shifts make it dif- 
ficult to study the effects of negative attitudes 
on verbal conditioning, since the attitudes 
become more positive in part as a result of 
receiving reinforcements. The inclusion in this
study of a postconditioning attitude measure
and the resulting attitude changes point
up the need for studies of interpersonal or
atmosphere variables to obtain postcondi-
tioning measures.

While the present results give support to 
Spielberger's (1962) contentions concerning 
the importance of cognitive processes in 
verbal conditioning, awareness was correlated 
with conditioning only for We responses. 
Whereas Spielberger also found desire for the 
reward to be related to conditioning, its im- 
portance was clearly secondary to that of 
awareness. The opposite was the case in the
current study. While Spielberger has rather 
consistently failed to find that unaware sub- 
jects condition, the present study did find 
conditioning for unaware subjects. It is pos- 
sible that the 75 extinction trials and the 
retesting on the evaluative questionnaire prior 
to assessing subjects’ awareness distorted the 
reports obtained from the subject and led to 
misclassification of subjects. The awareness 
questionnaire is, however, quite probing and 
on the face of it appears more likely to sug- 
gest awareness than to be insensitive in its 
detection. 

Awareness was not related to reinforcer or 
subject attitudes. This finding is somewhat 
in contrast to Farber's (1963) data which
indicated that the imposition of a failure
experience by a “nasty” reinforcer decreased
reported awareness. Krasner, Weiss, and Ull-
mann (1961), like the present investigators,
failed to find any relationship between aware- 
ness and atmosphere effects. 

All personality measures, including all
scales of the CPI, used in this study failed to
correlate significantly with conditioning.
Thus, there was no confirmation of the re-
ported relationship of the CPI scale Achieve-
ment via Independence and conditioning
(Krasner, Ullmann, Weiss, & Collins, 1961),
need for social approval and conditioning
(Crowne & Strickland, 1961), or of hos-
tility as measured by Sarason's scale and
conditioning.

The failure to confirm the findings of
Sarason and his colleagues (Ganzer & Sara-
son, 1964; Sarason, 1962) may be due to the
present study's failure to include the vari-
ables (e.g., sex of the experimenter) with which
hostility was found to interact in affect-
ing verbal conditioning. Further, the design
of the present study does not permit ana-
lyzing the effects of hostility by analysis of
variance, as was done in Sarason’s studies.

In conclusion, the present study provides
further evidence that interpersonal attitudes
affect verbal conditioning. Furthermore, such
attitudes are, in turn, influenced by verbal
reinforcement. The data failed to support
Spielberger’s findings that awareness is highly
correlated with conditioning. The present
study indicates that an empirical attack upon
the effects of interpersonal attitudes, the
mechanisms by which the effects are pro- 
duced, and the types of subjects and response 
classes affected by such attitudes would be 
quite fruitful. As Sarason and Minard (1963) 
have indicated, the verbal-conditioning para- 
digm may be quite useful in the study of 
social and personality variables.


LOCUS AND ORIENTATION OF THE PERCEIVER (EGO)
UNDER VARIABLE, CONSTANT, AND NO
PERSPECTIVE INSTRUCTIONS


THOMAS NATSOULAS 
University of California, Davis 


An experiment is described on the locus and orientation of the perceiver (ego). 
The basis of inference concerning this theoretical concept is a procedure in- 
troduced by Krech and Crutchfield involving responses to tracings on the 
skin. The tracings chosen are ones that can be perceived from either an 
external (“objective”) perspective, that of E, or an internal ("subjective") 
perspective, that of S in a sense "looking out." Letters and line figures were 
traced on the left side of the head. Under conditions of no perspective in- 
structions, Ss at first responded on the average equally often from an internal 
and an external locus. With trials they tended to adopt consistently one or the 
other locus, with the external one being favored. It was found also that under 
conditions of constant perspective instructions, i.e., where the same perspective
was required over a series of trials, Ss were able to respond equally accurately
from the 2 loci (about 85% responses correct), although latencies were greater
under external instructions. Variable perspective instructions, i.e., where S does
not know from which perspective to respond until instructed on each trial,
yielded large differences; internal instructions gave better and faster perform- 
ance. An interpretation of these results is suggested consistent with the con- 
ceptual scheme proposed by Natsoulas and Dubanoski, a conceptual scheme 
having as a focal concept the perceiver or ego, its possible loci and orientations. 


The inferred locus from which perception 
takes place has received occasional but not 
intensive attention. Hebb (1960) has spoken 
of the possible need in psychological theory 
for an ego or perceiver, meaning by this an 
organization of mediating processes, which 
has effects of a very specific, and at times 
unusual, kind. He refers to instances of pilots’ 
flying at high altitudes and subjects’ serving 
in isolation studies, reporting “the fantasy of 
a separate mind.” Of relevance to the present 
experiment is the fact that these subjects and 
pilots sometimes report a shift in perceptual 
locus, such that they seem to be perceiving 
themselves (their bodies) from an external 
vantage point. Such observations provide sup- 
port, albeit in exaggerated form, for what is 
hypothesized by Mead (Strauss, 1956) to 
occur habitually in social situations. Mead
presents an analysis of social interaction based 
on the capacity of men to form perceptual 
objects of themselves from loci corresponding 
to those of other people or of a generalized 
other. Hebb claims that “the failure of ex- 
perimental psychology to deal with the ‘I’ or 
‘ego’ is a cause of its continued inadequacy 
with regard to clinical matters [p. 740].” In
the present experiment an attempt is made to
assess the locus which the perceiver takes
spontaneously in responding to cutaneous-
form stimulation, the ability to maintain an
external perspective as compared with an in- 
ternal perspective with respect to such stimu- 
lation, and the ease and accuracy with which 
shifting to each perspective can occur upon 
immediate instructions. In all instances the 
other whose perspective is taken is the experi- 
menter. 

In their Elements of Psychology, Krech
and Crutchfield (1958) describe “an experi-
mental demonstration of the fact that people
can assume different perspectives for the self
in relation to the body [p. 205].” A script
capital letter E was traced on the forehead of
202 students, 76% of whom reported it as
3 while the rest reported E, the women report-
ing the symbol as 3 more frequently than the
men. Krech and Crutchfield interpret their
demonstration as showing the perspective,
“outside” or “inside,” from which the person
tends to perceive.

Using the symbols p, d, b, and q, Natsoulas
and Dubanoski (1964) tested a conceptual
scheme which proposes two mediating deter-
minants of how a letter drawn on a person's
head is perceived: the normal orientation of
the perceiver (ego), and the magnitude of
change in normal orientation necessary for
perceiving the letter from an internal locus.
They assume that the greater the change in
orientation necessary for perception from an
internal locus, the greater the likelihood of
perception from an external locus. An experi-
ment was performed which varied the place
on the head where the letter is drawn (fore-
head, left side, right side, or back) and the
head orientation (straight ahead, facing left,
or facing right). Within each condition of
head orientation predictions were made and
confirmed with respect to perspective taken in
responding to tracings on the head at the
various places. This required the additional
assumption concerning the normal orientation
of the perceiver that it is straight ahead when
the person is facing straight ahead and part
way to the left or right when the head is
turned in that direction.

Another factor affecting perspective taking 
has been demonstrated by Krech and Crutch- 
field (1958); they found that given a set the 
person could perceive the same symbol from 
either an internal or external locus. The ex-
periment reported here is related to this ob-
servation in that it explored mainly the effects
of instructions to take a particular perspec-
tive. All tracings were done on the left side of
the head with head facing forward; these are 
the conditions under which Natsoulas and 
Dubanoski (1964) found frequencies of spon- 
taneous reports from an internal and external 
perspective to be nearly equal. 


METHOD 
Subjects 


Forty-eight men and an equal number of women, 
undergraduates at the University of Wisconsin, 
served individually as subjects. They volunteered to 
participate in exchange for points on their final ex- 
aminations in the introductory psychology course. 
Every subject was tested by the same male experi- 
menter in the same experimental room. 


Procedure 


Upon arrival the subject was seated facing a bare
wall while the experimenter sat within arm's reach
at right angles to the subject facing the left side of
his head. The time between trials within any series
was about 5 seconds, just sufficient to record the
subject's response, its latency, and to reset the stop-
watch. The three parts of the experimental session
were continuous with no pause between parts. 

Part I. In Part I the following instructions were 
read: 


In the first part of this study, I will trace with 
my finger some letters of the alphabet on your 
temple. You will not see the tracings, of course; 
you will feel them on your temple. The letters I 
will trace are these four [the experimenter showed 
the letters printed by hand on a 3 X 5 inch index 
card and left the card visible through the re- 
mainder of these instructions], d, p, b, and q, all 
small, print letters. I will trace them in the same 
shape as they have on the card. Each time I make 
a tracing, I will do it only once. In the course of 
this first part of the experiment each letter will 
be traced several times on your temple, but each 
time it is traced it will be traced only once. 

Your job in this part of the experiment is to
give me an indication of how you experience the 
tracings that I make. You can do this by stating 
which of the four letters seems closest to the 
experience you have as a result of a tracing, that 
is, which letter fits your experience best. The idea 
is not to think about what is happening. Just tell 
me which letter you experience, which one comes 
to mind as I trace. 

Now please shut your eyes and keep them shut 
until the end of this part of the study. Face 
straight toward the wall. As soon as I trace some- 
thing on your temple tell me as quickly as you 
can what letter you experience. My interest is in 
your immediate reaction, not in your thought 
processes. I will be timing your responses. Ready? 
Here is the first one. 


The letters were traced with the forefinger in a
single motion beginning with the end of the stem
of each letter. All tracings were made on the left
side of the subject's head just in front of the ear.
The 16 trials in Part I each consisted of a single
tracing and response. Four subjects, two of each
sex, experienced the same order of tracings; 24
orders in all were used. These orders began with 4
trials which varied in such a way as to exhaust all
possible orders of the four letters. With this in mind
the order for all subjects can be described as 1, 2, 3,
4, 3, 2, 4, 1, 3, 1, 4, 2, 1, 2, 4, 3, where 1, 2, 3, and 4
represent the four letters. Latencies of response were
measured by stopwatch to the nearest tenth of a
second. The watch was started just as the tracing
began and was stopped when the subject’s oral re-
sponse was heard.
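
The construction of the 24 tracing orders can be sketched as follows: each permutation of the four letters is substituted into the fixed 16-position template of 1s through 4s given above. This is a minimal illustration in Python; the function name and the particular assignment of permutations to subjects are assumptions, since the text does not say which subject received which order.

```python
from itertools import permutations

LETTERS = ("d", "p", "b", "q")
# Position template from the Procedure; 1-4 index into a permutation of the letters.
TEMPLATE = (1, 2, 3, 4, 3, 2, 4, 1, 3, 1, 4, 2, 1, 2, 4, 3)


def tracing_orders():
    """Build one 16-trial order for every permutation of the four letters."""
    return [tuple(perm[i - 1] for i in TEMPLATE) for perm in permutations(LETTERS)]


orders = tracing_orders()
print(len(orders))                 # number of distinct orders
print(" ".join(orders[0]))         # the order built from the first permutation
print(96 // len(orders))           # subjects per order with 96 subjects
```

Because each digit occurs four times in the template, every order presents each letter four times, once in each block of four trials only for the first block; later blocks repeat letters unevenly by design of the template.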

Part II. Part II consisted of two successive series
of 16 trials each (Part II₁ and Part II₂). In one
series the subject was asked to assume an external
perspective and in the other an internal perspective.
Half of the men and half of the women received the
external instructions first, the other halves the in-
ternal instructions first. The division by two of each
sex group was determined by assigning 24 subjects
to each group in Part II representing all 24 orders
of stimuli in Part I. The orders of presentation of
stimuli in Part II were the same ones as used in 
Part I except that no subject received in the three 
series of 16 trials any order of tracings more than 
once. 

The instructions for Part II₁ were as follows:


I want you to think of this piece of plastic as 
the side of your head [the experimenter showed 
the subject an unmarked sheet]. Suppose I draw 
a 3 on it. This is clearly a 3 from my point of 
view and at the moment from yours. But if we 
turn the sheet over, it looks like I have drawn a 
script capital letter e. Similarly if I were to draw 
a script capital e on your temple, it would appear 
as such from my perspective; from yours it would 
look like a 3. There are, then, two perspectives 
from which you can perceive something drawn on 
your head; therefore, when I ask you what is 
being traced, there are two correct answers, de- 
pending on your point of view. Thus when I 
draw a p on your temple you might call it a p
from my point of view or a q from yours [re- 
peated for q]. Similarly with b and d. These, too, 
are mirror images of each other. 

In the next part of the experiment, I will trace 
the same letters on your temple in exactly the 
same way as before. Everything will be the same 
except that this time you must respond to every 
tracing from my (your) perspective, that is, you 
must tell me what each letter is from my external 
point of view, as it looks to me (i.e., you must
tell me what the letter is from your internal point 
of view, as it seems to you "looking out"). 

Now please shut your eyes and keep them shut 
until the end of this next part of the study. Con- 
tinue to face straight ahead. 

As soon as I trace something on your temple 
tell me as quickly as you can what letter it is 
from my external (your internal) perspective. I 
will again be timing your responses. Ready? Here 
is the first one. 


For Part II₂ the last three paragraphs were read
again with the appropriate changes for the other 
perspective. 

There were then four experimental groups in 
Part II comprised of the combinations of sex and 
order in which perspective instructions were given. 

Part III. In Part III the following instructions 
were read to the subject: 


In the next part I will continue the tracing. 
Instead of tracing letters, however, I will trace 
some very simple, meaningless, straight-line fig- 
ures. Because of this change in what will be 
traced it will be necessary for you to draw the 
figures rather than identify them verbally [the 
subject was given at this point materials for 
drawing]. Again you will be required to respond 
from either an external or an internal perspective. 
Just before I trace each figure I will say "internal" 
or "external." This means draw the figure from
the perspective indicated. The figures are very
simple, but they are not symmetrical and therefore
they do appear differently depending on the per- 
spective you take. 

As before I will be timing your response. Begin 
to draw as soon as you can after I have finished 
tracing. Keep your eyes shut as you draw and 
throughout the next part of the experiment. Draw- 
ing with your eyes closed will be easy since these 
are very simple figures. Ready? Here is the first 
one. 


All subjects received 24 tracings each different from 
every other (see Stimuli). Half the subjects were 
given one order of internal and external instructions: 
I, E, E, I, E, I, I, I, E, I, E, E, I, I, E, I, E, E, E,
I, I, E, I, E; the other half received the same order
except with I's and E’s substituted for each other. 
Another variable in Part III was the time between 
the instruction for a trial and the tracing. Half the 
subjects were given 5 seconds to adopt the perspec- 
tive indicated by the experimenter for that trial; 
the other half were instructed simultaneously with 
the beginning of a tracing. Thus there were four 
experimental groups based on manipulations in Part 
III, resulting from the combinations of two orders 
of instructions and two intervals between instruc-
tions and tracings. Assignment of subjects to each
experimental group in Part III was based on equal
representation from each experimental group in Part
II.
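
The Part III design described above can be sketched in Python: one instruction sequence, its complement with I's and E's exchanged, and the crossing of the two orders with the two instruction-to-tracing intervals. The names `ORDER_A`, `ORDER_B`, and the interval labels are hypothetical labels introduced for illustration only.

```python
from itertools import product

# Instruction order given to half the subjects (I = internal, E = external).
ORDER_A = "I E E I E I I I E I E E I I E I E E E I I E I E".split()


def swap_perspectives(order):
    """Return the order with I's and E's substituted for each other."""
    return ["E" if label == "I" else "I" for label in order]


ORDER_B = swap_perspectives(ORDER_A)

# Hypothetical labels for the two intervals between instruction and tracing.
INTERVALS = ("5-second lead", "simultaneous")

# Two orders crossed with two intervals give the experimental groups.
groups = list(product(("order A", "order B"), INTERVALS))

print(len(ORDER_A), ORDER_A.count("I"), ORDER_A.count("E"))
for group in groups:
    print(group)
```

Counting the sequence shows that it contains the two instructions equally often, so the complemented order does as well.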


Stimuli 


The series of figures for Part III consist of 45- 
degree and 90-degree angles drawn beginning at one 
corner of an imagined square, with vertex at an- 
other corner, and ending at a third. The 24 tracings 
used exhaust the possibilities for beginning at every 
corner of the square, going to every other corner, 
and from there to one or the other of the remaining 
corners, all in straight lines. On each trial in Part
III all drawings were equally often represented
across subjects in each experimental group. 
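
As a check on the count of 24 figures, the enumeration can be sketched in Python: every ordered choice of three distinct corners of a square (start, vertex, end) is one tracing, and the angle at the vertex is either 45 or 90 degrees. The unit-square coordinates and the function name are assumptions made for illustration.

```python
import math
from itertools import permutations

# Corners of an imagined unit square (hypothetical coordinates).
CORNERS = [(0, 0), (1, 0), (1, 1), (0, 1)]


def vertex_angle(start, vertex, end):
    """Angle in whole degrees at `vertex` for the path start -> vertex -> end."""
    v1 = (start[0] - vertex[0], start[1] - vertex[1])
    v2 = (end[0] - vertex[0], end[1] - vertex[1])
    cosine = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
    return round(math.degrees(math.acos(cosine)))


# Each ordered triple of distinct corners is one two-segment figure.
figures = list(permutations(CORNERS, 3))
angles = [vertex_angle(*figure) for figure in figures]

print(len(figures))
print({angle: angles.count(angle) for angle in sorted(set(angles))})
```

Under these assumptions the enumeration yields 24 figures, 8 with a 90-degree vertex and 16 with a 45-degree vertex, which agrees with the description of the stimuli.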

All scoring for all parts of the experimental ses-
sion is in terms of external and internal perspective.
For example, when p is drawn, a response of p or b
(much the rarer) is considered external, a response of
d or q (also rare) considered internal. Similarly in
the case of the figures for Part III, the presence or
absence of right-left reversal is the basis for scoring
a response as internal or external; other attributes
of the subject's drawings are ignored.
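
The scoring rule for the letters can be sketched as a small Python function. It assumes, consistent with the example given for p, that only right-left reversal matters, so each letter is reduced to the side of the stem on which its bowl sits. The table `BOWL_SIDE` and the function names are illustrative assumptions, not the author's scoring materials.

```python
# Side of the stem on which the bowl of each printed letter sits.
BOWL_SIDE = {"p": "right", "b": "right", "q": "left", "d": "left"}


def score_response(traced, response):
    """Return 'E' (external) when there is no right-left reversal, else 'I' (internal)."""
    return "E" if BOWL_SIDE[traced] == BOWL_SIDE[response] else "I"


def proportion_internal(trials):
    """Proportion of I responses in a list of (traced, response) pairs."""
    scores = [score_response(traced, response) for traced, response in trials]
    return scores.count("I") / len(scores)


# Hypothetical block of four trials, for illustration only.
block = [("p", "p"), ("d", "b"), ("b", "b"), ("q", "p")]
print([score_response(t, r) for t, r in block])
print(proportion_internal(block))
```

Up-down inversions (p for b, d for q) leave the score unchanged, which mirrors the statement that other attributes of the response are ignored.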


RESULTS 
Part I 


The proportion of responses from an in- 
ternal perspective (I responses) was deter- 
mined for the four consecutive blocks of four 
trials for each subject. These proportions were 
transformed to angles by arc sine, and the 
resulting scores submitted to an analysis of
variance with sex and blocks of trials as inde-
pendent variables.
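
The arc sine step can be illustrated with a short Python sketch. The text does not say which form of the transformation was used; the version below assumes the common angular transformation, the arc sine of the square root of the proportion expressed in degrees, and the function name is hypothetical.

```python
import math


def arcsine_transform(proportion):
    """Angular transformation: arcsin(sqrt(p)) in degrees (assumed form)."""
    return math.degrees(math.asin(math.sqrt(proportion)))


# With blocks of four trials, a subject's proportion of I responses can only be k/4.
for k in range(5):
    p = k / 4
    print(f"{k} of 4 I responses: p = {p:.2f}, angle = {arcsine_transform(p):.1f}")
```

The transformation spreads out proportions near 0 and 1, where the variance of a proportion is smallest, which is the usual reason for applying it before an analysis of variance.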

Only the main effect of blocks is statisti-
cally significant (F = 3.78, df = 3/282, p <
.025) with proportions of I responses for four
successive blocks of .497, .451, .404, and .404.
In Figure 1 the kind of changes over blocks
can be seen more exactly. It shows the number
of subjects giving various frequencies of I
responses within each block. A shift in the
distribution from symmetry around a mode of
two I responses in the first block toward bi-
modality at none and four in the last block is
evident. As trials proceeded the subjects ap-
pear to have adopted one or the other per-
spective as a consistent basis for response; in
the first block 78 of the 96 subjects responded
from both perspectives, in the second their
number reduced to 60, in the third 43, and in
the final block 41 responded from both an
internal and an external perspective. Most of
the shift to a consistent perspective was to
the external point of view.

The mean latency for each block for each
subject was submitted to an analysis of vari-
ance; only blocks approaches significance (F
= 2.47, df = 3/282, p < .10). When the mean
latencies are converted to logarithms to reduce
skewness, blocks is statistically significant (F
= 7.33, p < .001). The mean latencies for the
four blocks are, respectively, 2.46, 2.42, 2.29,
and 2.22 seconds. This improvement in per-
formance appears to be due to reduction in the
time for an E response; if each response is
treated as equivalent to every other, that is, if
the variable of subjects is ignored, in calculat-
ing the mean latencies for the two perspec-
tives, a reduction in the time for an E re-
sponse is evident (2.51, 2.45, 2.25, 2.15)
while the mean latencies for I responses
hardly change (2.42, 2.38, 2.36, 2.32).

FIG. 1. Number of subjects giving various frequen-
cies of I responses within each block of Part I.
(Axis label: number of responses from an internal perspective.)

The apparent differences in rate of improve- 
ment for the two kinds of responses may be 
related to the fact that the external perspec- 
tive became more favored with trials; in the 
last two blocks 86% of all E responses were
given by subjects who gave three or four E
responses in a block (see Figure 1), while the
value for I responses is 73%. Thus, in the last
two blocks I responses appear to have been, 
more frequently than were E responses, lapses 
from the adopted perspective occurring per- 
haps as a result of an unstable locus or dis- 
tracting variables. 


Part II 


For every block and each subject the proportion
of responses correct was calculated,
the proportions converted by arc sine to
angles, and the resulting scores submitted to
an analysis of variance containing a 2 × 2
Latin square for repeated measures, order of
test with a particular perspective instruction
being the Latin square factor. Also contained
in the analysis were the independent variables 
of sex and blocks. 

Blocks is the only significant effect (F 
= 3.69, df = 3/276, p < .025). The propor- 
tions of correct responses in the four blocks 
were, respectively, .820, .855, .857, and .869. 
The proportion of I responses under I instruc- 
tions was .859, of E responses under E in- 
structions, .842. The absence of a significant 
effect of orders suggests that for many sub- 
jects practice in taking one perspective does 
not affect how accurately they respond from 
the other perspective. 
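
The arc sine conversion of proportions to angles used in Part II can be sketched as follows. The text does not give the exact formula, so the sketch assumes the common angular transformation, the arc sine of the square root of the proportion expressed in degrees; the function name is illustrative.

```python
import math

def arcsine_transform(p):
    """Angular transformation of a proportion: asin(sqrt(p)) in degrees.

    Assumed form of the "arc sine" conversion; the source does not
    state which variant was used.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a proportion must lie between 0 and 1")
    return math.degrees(math.asin(math.sqrt(p)))

# Proportions correct reported for the four blocks of Part II.
block_proportions = [0.820, 0.855, 0.857, 0.869]
angles = [arcsine_transform(p) for p in block_proportions]
for block, (p, angle) in enumerate(zip(block_proportions, angles), start=1):
    print(f"Block {block}: p = {p:.3f} -> {angle:.1f} degrees")
```

The transformation stretches differences among proportions near 0 or 1, which makes the variance of the scores less dependent on their mean before they are submitted to an analysis of variance.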

An analysis of variance of this same type 
was performed on the mean latencies for 
every block for each subject. Blocks, orders, 
and instructions are the significant effects (re- 
spectively, F = 22.98, df = 3/279, p < .001; 
F = 10.21, df = 1/92, p < .005; F = 18.09,
df = 1/92, p < .001). The effect of blocks is
associated with a diminution in mean laten- 
cies: 2.11, 1.96, 1.89, and 1.83 seconds. The 
I instructions yielded faster responding (M 
latency = 1.84) than did the E instruction 
(2.05). The significant effect of orders indicates
that the mean latencies in Part IIA
were longer (2.03 seconds) than in Part IIB
(1.87 seconds). An analysis of variance of
the scores of Part IIA alone yielded instructions
and blocks as significant (F = 10.38, df
= 1/92, p < .005 and F = 13.80, df = 3/276,
p < .001, respectively). The mean latency
under E instructions was 2.19 seconds in
Part IIA and under I instructions it was 1.87
seconds. In Part IIB the corresponding values
were 1.92 and 1.81, which are not significantly
different by analysis of variance.


Part III 


The proportion of correct responses for 
each block of six trials with the same instruc- 
tions was determined for each subject. An 
analysis of variance was performed on these 
proportions after they were converted to 
angles by arc sine. The main effects for this 
analysis were blocks, instructions, orders (of 
presenting instructions), time intervals (be- 
tween instructions and tracing), sex, and 
groups (in Part II having one or the other 
order of instructions). 

Instructions was a significant main effect 
(F—11L74, df= 1/80, p< .001). I in- 
structions gave more correct responses than 
E instructions; .890 is the proportion of 
correct responses under I instructions and 
.501 under E instructions. This advantage of 
I over E instructions held for both blocks 
of trials. In the first half of Part III the
respective proportions correct were .884 and
.458; in the second half, .896 and .543. The
interaction of Instructions × Blocks is not
significant (F = 2.90, df = 1/80, p< .10), 
but there is a significant main effect of blocks 
(F = 5.57, df = 1/80, p < .025). 
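
As a rough consistency check on the Part III proportions, the overall values should lie close to the mean of the two halves. The Python sketch below assumes that both halves contributed the same number of trials per instruction, which the text implies but does not state outright; the names are illustrative.

```python
# Proportions correct reported for the two halves of Part III.
halves = {
    "I instructions": (0.884, 0.896),
    "E instructions": (0.458, 0.543),
}
# Overall proportions correct reported in the text.
reported_overall = {"I instructions": 0.890, "E instructions": 0.501}

def mean_of_halves(first, second):
    """Unweighted mean of two proportions (assumes equal trials per half)."""
    return (first + second) / 2

for label, (first, second) in halves.items():
    overall = mean_of_halves(first, second)
    gap = abs(overall - reported_overall[label])
    print(f"{label}: mean of halves = {overall:.4f}, gap = {gap:.4f}")
```

Gaps of about .001 or less are what rounding to three decimals would produce, so the half-by-half and overall figures agree with each other.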

The interaction of Instructions × Groups is
significant (F = 5.39, df = 1/80, p < .025). 
This interaction reflects the greater accuracy 
in responding in terms of a particular per- 
spective when it was the last instructed 
perspective in Part II. This is particularly the 
case under E instructions. E instructions in 
Part IIB result in proportions of .865 and
.561 correct for I and E instructed trials,
respectively; I instructions in Part IIB result
in proportions of .915 and .440 correct. 

Finally, also significant is the main effect 
of orders (F = 5.55, df = 1/80, p < .025). 
The order in Part III which begins with a
trial on which there is an I instruction results
in a total proportion correct of .731,
while the order beginning with an E instruction
yields a proportion of .663.

An analysis of variance of the same type
was also performed on the mean latencies for
every block of six trials for each subject.
Blocks and instructions are the significant
effects (respectively, F = 12.72, df = 1/80,
p < .001 and F = 40.15, df = 1/80, p
< .001). The two blocks under the I instructions
gave mean latencies of 2.93 and
2.70 seconds; under the E instructions, 3.30
and 3.14 seconds.


Discussion 


The following interpretation of the results
is selected for its consistency with the conceptual
scheme proposed by Natsoulas and
Dubanoski (1964).

Part I provided data under instructions
which did not require explicitly a particular
perspective. The instructional emphasis on the
subject’s experience was meant to free him
from concern with accuracy, thus permitting
study of his spontaneous perspective taking.
Natsoulas and Dubanoski found for the same
head orientation, place on head where letter
is drawn, and instructional emphasis a proportion
of I responses of .490, a value nearly
equal to that found here in the first block
(.497). They argued that if the change of
orientation necessary for perceiving from an
internal locus is more “perceptually effortful”
than a change in the locus of the perceiver
from internal to external, the number of E
responses would exceed the I responses. In the
present study, with the subject facing forward
and letters drawn on the left side of the
head, the change in orientation would be a
matter of 90 degrees, on the assumption (supported
by the findings of Natsoulas and
Dubanoski) that when facing forward the
normal orientation of the perceiver is straight
ahead. The frequencies were almost equal for
I and E responses in the first block of trials
and in the previous study under the same
conditions; this suggests that the two kinds
of changes, in orientation and in locus, are
just about equally “effortful” under the given
conditions.

With the experimenter sitting to the subject's
left, facing and stimulating the left side
of his head, one might have expected that
there would be a shift in the subject’s normal
angle of orientation from straight ahead
(0 degrees) counterclockwise to somewhere
between 0 degrees and 270 degrees, on an
analogy with the tendency to fixate the periphery
upon stimulation from there. This
would be reflected in an increase over blocks
in the number of I responses as a result of
the reduced angular change in orientation
necessary to perceive from an internal locus.
Instead, the results show that over blocks
more E responses were forthcoming. However,
the above reasoning appears to be more appropriate
to the conditions of Part III than to
those of Part I. In Part I from trial to trial
the subject did not have to assume a perspective
again and again; he could take a perspective
and maintain it over trials. The
trend shown in Figure 1 of more and more
subjects responding consistently from a single
perspective as trials proceeded is in line with
this argument. The question remains as to
why the external perspective became favored.
The answer may be self-reinforcement by the
subject for E responses since they represent
our conventional definition of accuracy.

In Part II, as a result of the experimenter's
instructions, the subjects appear to have
taken a perspective and maintained it very
well, with an accuracy of response of about
85%. Importantly for the argument that subjects
can take and maintain a perspective
over trials, they did equally well under the
two perspective instructions. In Part III, on
the other hand, where it was necessary to
take a perspective on every trial, I instructions
yielded far better performance than E
instructions. To account for the latter a
change in the normal orientation of the perceiver
is invoked resulting from the experimenter's
position relative to the subject and
perhaps the repeated stimulations on the left
side of the head. The orientation having thus
shifted to the left, a relatively small change
in orientation is pitted against a change in
locus. Consequently there would be an automatic
tendency to give an I response despite
the E instructions in Part III.

It was expected that the 5 seconds between 
instruction and tracing in Part III would
result in better performance than when no
interval between them was allowed, on the
grounds that there is an advantage to assuming
a perspective prior to stimulation. If I
and E instructions yielded differences between
perspectives under the 0-interval condition,
it was expected that the 5-second interval 
would reduce them by permitting time to take 
that perspective which upon stimulation 
would be less likely. The contrasting results 
under E instructions for Parts II and III 
also contribute to this expectation. It is un- 
likely that 5 seconds is insufficient time to 
take a perspective. The absence of an effect 
of the 5-second interval may be due to the
subject's intentness on the stimulation, leading
him to postpone taking a perspective
until it was necessary. In other words, the 
subject may have treated each instruction as 
a way to respond to the stimuli rather than 
as indicating to him something he could do 
prior to stimulation. The little time between 
trials may have contributed to this by giving 
the subject little time to recover from the 
previous trial. 

The data on latencies are difficult to inter- 
pret because they reflect the time it takes 
to assume a perspective and then to respond 
from it. Since by previous argument the results
of Part II do not contain differences
resulting from taking a perspective, the dif- 
ferences in mean latencies between E and I 
instructions must be attributed to differences 
in responding from the two perspectives. In 
Part III differences between instructions 
would reflect differences in both events. 

Throughout the various phases of the present
study there were sizable individual differences
in the ability of subjects to take one
or the other perspective. Such differences
require comment, especially since the con- 
ceptual scheme being used to organize the 
results cannot account for them. 

The speed and accuracy of imitation of the 
actions of a person who stands opposite, as 
opposed to alongside, also may serve as an 
indicator of the ease with which the locus 
of the perceiver can be shifted. A key ob- 
servation is the correctness with which right 
and left are distinguished in the imitation of 
laterally asymmetric actions. Werner (1948) 
suggests that accuracy in this task shows that
“the two positions are . . . seen from the
standpoint of the person opposite [p. 174].”
He refers to Gordon (1923) as having shown
that only after his ninth year does a child
correctly imitate the movements of the left
and right hands of someone facing him, while
cautioning the reader with regard to the
relativity of the age mentioned to the complexity
of the actions imitated. This suggests
the possible usefulness of the imitation procedure
(as well as the closely related tracing
procedure of the present experiment), with
sufficiently complex actions, in testing adults 
with respect to their relative fluidity of 
standpoint. 

According to Werner the correct imitation 
of "opposite" actions is a sign of greater 
“objectivation,” meaning greater differentia- 
tion of ego and object. He implies that such 
differentiation results in objects possessing 
a right and left of their own, and a stand- 
point of their own which the perceiver can 
take. To the extent that differentiation has
not or, for immediate reasons, is not taking
place, to that extent assuming an external
perspective would be relatively difficult, as
indexed by poorer accuracy and greater
latencies on tasks requiring shifting the locus
of the perceiver. 

Werner’s (1948) theory also suggests an 
important distinction between being able to 
take the standpoint of another person, and 
not being able to distinguish between the 
other’s and one’s own viewpoint (see also 
Piaget & Inhelder, 1956). Although the latter 
inability may give rise to phenomena re- 
sembling identification with the other, such 
effects are due to fluidity of ego boundaries
(poor differentiation) rather than to the as- 
sumption of the other’s perspective with an 
attendant knowledge of doing so. Asch 
(1952) has made a similar distinction in 
his discussion of sympathy. He contrasts 
sympathy and emotional contagion: 


It is because we become aware of the situation and 
experience of others that we can feel with them. 
The mere duplication of an observed reaction may
in fact be a sign of inadequate social relation. There
are times when the sight of suffering merely reminds 
a person of his own suffering; when this is so, he 
has simply lost social contact [p. 172]. 


The experimental procedures of the present 
study and of Gordon (1923) show promise 
of being able to determine how well a person 
can take another’s viewpoint in the advanced 
sense of Werner and Asch.

Preschool children interacted with an adult female model who was either
rewarding or nonrewarding and whose control over the child's future resources
was either high or low. Thereafter, the children participated with the
model in a game during which she behaved towards the child in ways designed
to be directly aversive for him (e.g., stern criticism and imposed
delay of reward) and also displayed novel neutral behaviors (e.g., distinctive
marching). Later, in the model's absence, an experimental confederate provided
stimuli that permitted S to reproduce the model's behaviors. Measures
were taken of the child's rehearsal of the model's neutral and aversive behaviors
and of his transmission of these behaviors to the confederate. More
children rehearsed the neutral behaviors of a rewarding than of a nonrewarding
model and high as opposed to low control over the child’s future resources
resulted in greater rehearsal of both neutral and aversive behaviors. Both rewardingness
and control enhanced behavior rehearsal, and children rehearsed
both neutral and aversive behaviors most frequently when the model was high
in rewardingness and control and least frequently when both these variables
were low. The transmission of aversive behaviors was increased by the model’s
initial rewardingness but not by her control.

Recent investigations have demonstrated
that children and adults who observe a model’s
behavior may reproduce that behavior
even when they are not directly reinforced
for it by social agents (Bandura & Walters,
1963). In these studies models exhibited a
variety of behaviors which had no direct consequences
for the subject. For example, children
observed an adult aggress against a doll
but they themselves were not the objects of
aggression. However, in many life situations
the behaviors exhibited by social models do
have direct positive or negative consequences
for the observer of the behaviors because he
is also their recipient. Thus, during much of
his socialization the child is the object, as
well as the witness, of the behavior of parental
and other models and receives direct
consequences from the behaviors they display.
The child who is spanked observes the
parental disciplinary style but also experiences
it.

1 This study was supported by Research Grant
M-6830 from the National Institutes of Health,
United States Public Health Service. Acknowledgment
is also made to the Stanford University Nursery
School.


It has been commonly noted, anecdotally
and in clinical observation, that individuals 
rehearse and transmit to others behaviors 
which had aversive consequences for them. 
For example, parents often claim they un- 
willingly behave towards their children in 
ways similar to those that produced pain for 
them when performed by their own parents. 
At present the factors governing the repro- 
duction of a model’s novel behaviors which 
were aversive to the individual in his inter- 
actions with that model remain ambiguous. 
Several theories (e.g., Maccoby, 1959; Sears, 
1957; Whiting & Child, 1953), including the 
well-known theory of “identification with the 
aggressor” (Bettelheim, 1943; Freud, 1946) 
have been invoked to account for this repro- 
duction and transmission of social punish- 
ments, and the difficulties of a secondary posi- 
tive reinforcement interpretation have been 
noted (e.g., Aronfreed, 1964). Relevant data, 
however, have been primarily clinical and 
informal and the very occurrence of the phe- 
nomenon itself has been rarely demonstrated 
in laboratory research. 

Notable exceptions are Aronfreed’s (1964) 
recent investigations into the origins of self-criticism
which illustrate that the timing of
punishment is an important determinant of 
the acquisition of self-critical labels. These 
studies led Aronfreed to conclude that 


The reproduction of social punishment appears to 
be acquired only when the relevant components of 
punishment have a circumscribed temporal relation- 
ship to an anticipatory aversive state [and that] 
the reproduction of social punishment cannot be 
subsumed under the consequences of a model's re- 
warding characteristics [pp. 212-213]. 


However, this cannot be accepted as a firm 
conclusion because in Aronfreed's study on 
the effect of the model's social characteristics 
the manipulations of the model's rewarding 
characteristics were confined to expressions
of verbal and physical approval (e.g., patting 
the child on the head) during the experi- 
mental procedure. It should also be noted 
that these studies did not involve the trans- 
mission of aversive behaviors by the subject 
to another person but only their reproduction 
in the experimenter's presence, usually in response
to the experimenter's direct probes.

There is considerable evidence that the 
characteristics of the model, such as his re- 
wardingness and power or control over re- 
sources, affect the extent to which some of 
his displayed behaviors are imitated by an 
observer (e.g., Bandura & Huston, 1961;
Mussen & Distler, 1959). The unresolved 
question is whether the characteristics of the 
model can affect the reproduction of social 
punishments which the individual not only 
observed but also directly received from the 
model. The two characteristics of the model 
which were manipulated in the present study 
are considered major determinants of imita- 
tion in several prominent theories of iden- 
tification. The conceptualizations of Sears 
(1957), Mowrer (1950), and Whiting and 
Child (1953) all give the rewarding charac- 
teristics of the model a central role for the 
development of identification. For example, 
in Mowrer's formulations, if the model's be- 
havioral attributes are paired with positive 
reinforcements they acquire secondary reinforcing
properties on the basis of classical
conditioning and, through stimulus generalization,
also become rewarding when performed
by the child. After the model's behaviors have 
been paired with positive reinforcement the 
child can self-administer secondary reinforcers
by reproducing components of the behavior.
In addition, Maccoby (1959) and Mussen
and Distler focus on the model's control of
resources as well as his rewardingness, stressing
the model's social power as the characteristic
which enhances identification. Maccoby,
for example, suggests that the control exercised
by a model over resources important
for the needs of the child will determine the 
amount of role practice in which the child 
engages. He will rehearse both the rewarding 
and the punishing characteristics of the model 
since both are relevant to him in guiding his 
plans about future actions. Because these be- 
haviors have been well rehearsed they are 
more likely to be performed when relevant 
eliciting stimuli evoke them by contiguity. 

In the present study rewardingness was 
manipulated by varying the degree to which 
the model provided the child with both mate- 
rial and social noncontingent reward. Power 
or control over both positive and negative 
outcomes was manipulated cognitively by 
varying the model’s role. For half the children
the model was introduced as a visiting teacher 
who would never reappear while for the other 
half she was introduced as the child’s new 
teacher. 

Thus, the present study investigated the 
effects of the model's rewardingness or use of 
noncontingent reinforcement, and his control 
over the subject's future resources, on the
degree to which behaviors displayed by the
model without direct consequences for the
subject (“neutral” behaviors) and those directed
at the subject with negative consequences
for him (“aversive” behaviors) are
rehearsed and transmitted. The main pur- 
poses of this study were to: demonstrate the 
occurrence of both rehearsal and transmission 
of aversive behavior; investigate the relative 
effectiveness of noncontingent reinforcement 
by the model and his control over future 
resources in producing this rehearsal and 
transmission; and compare the determinants 
of the rehearsal and transmission of such 
initially aversive behaviors with those of neutral
behaviors displayed by the model.
Preschool children were exposed to an adult
female model whose noncontingent rewardingness
and future control over the child were
varied. Thereafter, the children participated
with the model in a "special game" which in- 
volved playing with a cash register, making 
change with play coins and bills, etc. The 
model included the following behaviors during 
this interaction: she was aversive to the child 
in novel ways (aversive behavior) and she exhibited
novel behaviors with no direct reinforcement
consequences for the child (neutral
behavior). More specifically, the aversive
acts consisted of imposing delay of reward,
removal of reward, and criticism. The modeling
of neutral behaviors consisted of emitting
distinctive verbal and motor behaviors (e.g.,
marching around the room while saying
“March! March! March!”). The aversive
behaviors were designed to have direct nega- 
tive consequences for the subject whereas the 
neutral behaviors were merely modeled with- 
out direct reinforcement consequences for 
him. In the former instances the child was 
the object of the behavior whereas in the lat- 
ter he was only the observer of the displayed 
behaviors. Following these treatments, the 
subject’s task was to show another person 
who was dressed as a clown how to play the 
cash register game in the model's absence.
Measures were taken of the rehearsal and reproduction
of novel behaviors in the model's
presence, and of the transmission of neutral
and aversive behaviors to the “clown” in her
absence. 

In accord with the above-mentioned the- 
ories of identification which stress both the 
model's rewardingness and power, it was an- 
ticipated that the reproduction of both neu- 
tral and aversive behaviors would be more 
frequent when the model’s noncontingent re- 
inforcement was high than when it was low. 
The combination of noncontingent reinforce- 
ment and future control over resources was 
expected to result in the most frequent repro- 
duction of the model’s behaviors. Conversely, 
it was expected that reproduction would be 
least frequent when the combination of non- 
contingent reinforcement and future control 
is lowest. The other combinations of reward- 
ingness and control were expected to fall be- 
tween these extremes, with no prediction 
made concerning their relative strength. 
METHOD
Subjects 


The subjects were 56 Stanford nursery school
children ranging in age from 37 months to 64 
months with a mean age of 52 months. There were 
31 boys and 25 girls and the same adult female ex- 
perimenter was used with all children. The children 
were assigned randomly to each treatment, with 
separate assignments for each sex to assure that 
similar proportions of boys and girls were included 
in each condition. 


The Model's Rewardingness and Control 


Four manipulations were used to vary the model's 
rewardingness and control: 

High noncontingent reward (high reward). Attrac- 
tive toys were available (e.g. a bowling set, battery- 
operated cars and planes, musical tops, hand pup- 
pets) and the model and subject played together 
with them, the model showing interest, affection, 
and warmth, and frequently praising the child. In 
addition, she dispensed attractive stickers and cookies 
to the child. The intent was to establish the model 
as noncontingently rewarding, with highly attractive 
resources at her disposal. These operations were similar
to those usually labeled “nurturance” (e.g., Bandura
& Huston, 1961).

Low noncontingent reward (low reward). There 
were less attractive toys available (e.g., coloring
book, used crayons, broken toys) and the model 
informed the child that she had some work to com- 
plete saying, “I don't have too many toys here, but 
you can play with them for awhile. I'll be right over 
here working [pointing to corner of room]." The 
model responded minimally but pleasantly to atten- 
tion bids, indicating that she had to finish her work. 

High future control (high control). The model 
structured her role as that of the child’s new nursery 
school teacher, and she commented that they would 
therefore be seeing a lot of each other. 

Low future control (low control). The model
structured her role as that of a visiting teacher who 
was leaving the nursery school in an hour to take 
the bus back home to Milwaukee. She commented 
that they would therefore not be seeing each other
again.


Experimental Groups 


Each of the four experimental treatment groups 
contained 14 children and consisted of these combi- 
nations of the above-described operations: high 
reward-high control, high reward-low control, low 
reward-high control, low reward-low control. Each
involved a 20-minute interaction between the adult 
and child. There were 6 girls and 8 boys in each 
group except the low reward-high future control 
condition in which there were 7 boys and 7 girls. 
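
The cell sizes described above can be tallied and compared with the totals given under Subjects (31 boys and 25 girls). The Python sketch below is only a bookkeeping check on the reported numbers; the dictionary layout and names are illustrative.

```python
from itertools import product

# Cell composition as reported: 8 boys and 6 girls in each group, except
# low reward-high control, which had 7 boys and 7 girls.
cells = {}
for reward, control in product(("high", "low"), repeat=2):
    cells[(reward, control)] = {"boys": 8, "girls": 6}
cells[("low", "high")] = {"boys": 7, "girls": 7}

boys = sum(cell["boys"] for cell in cells.values())
girls = sum(cell["girls"] for cell in cells.values())
total = boys + girls
print(f"cells = {len(cells)}, boys = {boys}, girls = {girls}, total = {total}")
# prints: cells = 4, boys = 31, girls = 25, total = 56
```

Four cells of 14 children give 56 subjects, and the boy and girl counts match the figures reported under Subjects.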


Procedure 


The experimenter introduced herself to the child, 
identifying herself either as his new nursery school
teacher or as a visiting teacher who was leaving that 
same day (high or low control). The 20-minute play 
session followed in which she displayed either high- 
or low-reward behavior. The experimenter again re- 
minded the subject of her future role, and then took 
him to another experimental room to “play a special 
game with a toy cash register." On entering the room 
the subject was shown a large container of toys and 
was allowed to select the one he wanted most. This 
toy was placed in a bag and given to the child and 
he was told that he could take it home. 


Presentation of Neutral and Aversive Behavior


The experimenter and child seated themselves in 
front of the toy cash register. The game involved 
playing store with a cash register, making change, 
opening and closing the register drawer, hitting the 
register keys, etc. During the game, all subjects were 
exposed to the following two kinds of behaviors: 

1. Neutral behaviors. The model hit a key on the 
cash register and said “Bop,” marched around the 
table saying “March, march, march, march, march,” 
and repeated this sequence two more times. 

2. Aversive behaviors. (a) Imposed delay—When 
the child touched the cash register for the first time, 
the model said that if one wants to play with any- 
thing badly enough one ought to be able to wait for 
it, and instructed the subject to sit still with his 
hands in his lap until she finished counting. She then 
very slowly and methodically repeated the numbers 
“1, 2, 3" 15 times. (b) Criticism and removal of 
reward—The cash register was constructed so that,
unknown to the subject, the model could make the 
drawer come all the way out when it was opened, 
giving the appearance that it was broken. When this 
happened, as the child “broke” the drawer the ex- 
perimenter exclaimed sharply, “Oh my! Do you 
know what this makes you? It makes you a store- 
wrecker, and when you're a storewrecker, you lose 
your toy." She then removed the toy the child had 
received previously, saying sternly “You try not to 
be a storewrecker again." 

The model performed the neutral behaviors and 
the counting at two different times in the course 
of playing cash register with the child whereas the 
drawer was broken once. The children's reactions to 
the aversive behaviors varied from tears to silent 
but obvious tension and indicated that the behaviors 
had painful consequences for them. This was further 
substantiated by the fact that seven subjects (not 
included in the total N) had to be eliminated because 
they cried and became too upset to continue partici- 
pating. This occurred with similar frequency across 
treatment conditions. At the conclusion of the learn- 
ing session the child was left alone to play with the 
cash register in the experimental room for 3 minutes 
during which he was observed through a one-way 
mirror by the experimenter and her confederate. 
The purpose of this interval was to reduce any immediate
emotional arousal stemming from the interaction
and to observe any additional practice of the
model’s behaviors during her absence. 


Transmission 


The experimenter returned to the experimental 
room and led the child back to the room in which
the play session had taken place. She informed him 
that, as a special treat, he was going to be allowed 
to show someone else how to play the cash register 
game. All the events that had occurred during the 
game, including the neutral and negative behaviors, 
were reviewed verbally by the experimenter who also 
reminded the child again of her future role (high or
low control). The subject was told that the person
he was going to teach was a girl dressed up as a
clown. The adult and child then returned to the experimental
room where an adult female experimental
confederate was seated in front of the cash register.
The confederate was dressed as a clown to disinhibit 
children who might be reluctant to relate with a 
strange adult in novel and aversive ways. As the 
subject and the model came into the room the clown 
began to play with the cash register. The model play- 
fully tapped the clown's hand and told her not to 
play until she was told what to do by the subject. 
The model then left the subject and clown together, 
and observed them through a one-way mirror in an 
adjoining room. 

The clown behaved in a pleasant way to the sub- 
ject, nodding and bowing ocgasionally. When asked 
any questions about her background she answered 
minimally, for example, she was just pretending to
be a clown, she lived down the road, she did not 
have any age when she was a clown. If the subject 
did not immediately show the clown how to play, 
the clown attempted twice to elicit this behavior by
saying, “Can you show me what to do” and “I
really want to learn how to play.” If the subject still
made no response the clown began to play with the 
cash register and money by herself. Always, in the 
course of the transmission session, the clown broke 
the drawer and exclaimed, “Oh look what hap-
pened!” When the subject stopped demonstrating
the clown said, after 10 seconds had elapsed, “Is
there anything else? Can you show me what else 
to do?” 


Measures of Rehearsal and Transmission 


“Behavior rehearsal” refers to reproduction of 
any aspects of the model’s distinctive neutral or aver- 
sive behaviors, either in her presence while she par-
ticipated with the children in the cash register game
or during the interval in which the child was alone 
before his interaction with the clown. “Behavior 
transmission” was scored when the child enacted any 
aspects of the model’s neutral or aversive behaviors 
directly towards the clown while showing him the 
game. Because the referents for behavior rehearsal
and transmission involved the presence or absence of
clear overt behaviors (e.g., marching, counting) scor-
ing was unambiguous. Independent scoring was done
by the experimenter and the confederate who served
as clown and yielded perfect agreement with only
one exception. In the transmission phase the con- 
federate recorded her independent scoring of the 
child's behavior at the end of her interaction with 
him. After the experimental procedure was com-
pleted each child obtained toys and warm approval 
for his performance in a brief play session. 


RESULTS 


Twenty-six children, or almost 50% of the
total, did not rehearse or transmit either neu-
tral or aversive behaviors. The percentage of 
children in each treatment condition who re- 
produced none of the model’s behaviors was 
21 in the high reward-high control group, 50 
in the high reward-low control group, 64 in 
the low reward-high control group, and 50 in 
the low reward-low control group. Because 
of this highly skewed distribution the data 
were analyzed with chi-square tests.² Inspec-
tion of the data for sex differences indicated
no trends and scores for males and females 
were therefore combined. Chi-square compari- 
sons between treatment conditions were com- 
puted separately for neutral and aversive be- 
haviors and for behavior rehearsal and behav- 
ior transmission. In view of the lack of ap-
propriate eliciting stimuli during the rehearsal
phase it was not expected that "storewrecker"
would be repeated and indeed only one child
did so, all other rehearsals of aversive behav- 
iors consisting of slowly counting “1, 2, 3,” 
aloud in the manner modeled by the experi- 
menter during the imposed delay periods. In 
contrast, during the transmission phase, chil- 
dren called the clown a storewrecker and im- 
posed delay periods on him by counting repe- 
titiously with approximately equal frequency. 
With the exception of the virtual absence of 
"storewrecker" responses during rehearsal, in- 
spection of the data for reproduction of each 



2 Chi-squares were corrected for continuity when- 
ever df was 1 and the expected frequency in any 
cell was less than 10. All p values are based on one- 
tailed tests.

3 During the transmission phase the clown was 
not given a toy and therefore subjects did not have 
an opportunity to remove it in the manner used by 
the model towards the child during the initial inter- 
action. The clown’s script was designed only to 
elicit imitation of the model's aversive verbal be- 
haviors, namely, the use of the label “storewrecker” 
and repetitious counting in the style employed by 
the model when she imposed delay periods on the
child.

Fig. 1. Number of subjects rehearsing neutral and
aversive behaviors as a function of the model’s re- 
wardingness (R) and future control (FC). 


separate aspect of the model’s neutral behav- 
iors on the one hand, and her aversive behav- 
iors on the other, revealed no systematic pat- 
tern differences within these two classes of 
behavior. Therefore these separate aspects 
were combined and the four final scores as- 
signed to each child indicated the presence or 
absence of imitative neutral behavior, and 
imitative aversive behaviors, respectively, 
computed for the rehearsal phase and the 
transmission phase separately.⁴


Behavior Rehearsal 


Figure 1 shows the number of subjects in 
each treatment condition who rehearsed neu- 
tral and aversive behaviors. Chi-square tests
comparing the relevant treatment groups and 
treatment combinations on rehearsal of each 
class of behavior are included in Table 1. As 
predicted, significantly more children re- 
hearsed both aversive (p < .025) and neutral 
(p < .005) behaviors when the model was 
both highly rewarding and had future control 
than when her rewardingness and control were 
low. 
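
The continuity-corrected chi-square described in Footnote 2 can be sketched for a single 2 × 2 comparison as follows. This is an illustrative reconstruction only: the function name is hypothetical, the counts in the example are invented (the cell frequencies from Figure 1 are not recoverable from the text), and the one-tailed p is assumed to be half of the two-tailed value.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Chi-square (df = 1) for the 2 x 2 table [[a, b], [c, d]].

    Applies Yates' continuity correction when any expected cell
    frequency is below 10, following the rule stated in Footnote 2.
    Returns (chi_square, one_tailed_p, corrected).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")

    # Expected frequency of each cell under independence.
    expected = [r * k / n for r in (row1, row2) for k in (col1, col2)]
    corrected = min(expected) < 10

    diff = abs(a * d - b * c)
    if corrected:
        # Yates' correction: shrink |ad - bc| by n / 2, never below zero.
        diff = max(diff - n / 2, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)

    # For df = 1 the two-tailed p equals erfc(sqrt(chi2 / 2));
    # the one-tailed p is assumed here to be half of that value.
    p_one_tailed = math.erfc(math.sqrt(chi2 / 2)) / 2
    return chi2, p_one_tailed, corrected


# Hypothetical counts (NOT the study's data): rows are two treatment
# groups; columns are "reproduced" and "did not reproduce".
chi2, p, corrected = chi_square_2x2(8, 6, 0, 14)
print(f"chi-square = {chi2:.3f}, one-tailed p = {p:.4f}, corrected = {corrected}")
```

With small groups such as these, every expected frequency falls below 10, so the correction is applied; with large cell counts the same function returns the uncorrected statistic.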

Comparisons of the two high-control groups 


4 The data on behavior rehearsal are based pri-
marily on rehearsal in the model’s presence. During
the rehearsal phase, only four children imitated the 
model’s behavior while alone. Of these, three re- 
hearsed both neutral and aversive behavior whereas 
one rehearsed only aversive behavior. None of these 
four children rehearsed the model’s behaviors in her 
presence.

showed that rewardingness significantly af-
fected the rehearsal of neutral (χ² = 3.65,
p < .05) but not aversive behavior (χ² = .65),
although the trend was in the same direction 
for aversive behavior. That is, with high con- 
trol, high as opposed to low rewardingness 
produced greater rehearsal of neutral behavior 
but the effect was not significant for aversive 
behavior. Likewise, significantly more sub- 
jects in the two high-reward groups combined 
rehearsed neutral (p < .05) but not aver-
sive (χ² = .53) behaviors than in the two
low-reward conditions combined. 

In the high reward-high control group, 
more subjects rehearsed both aversive (χ²
= 5.30, p < .025) and neutral (χ² = 5.39,
p < .025) behaviors than in the high
reward-low control group. That is, with 
equally high rewardingness, high as opposed 
to low control produced greater rehearsal of 
both aversive and neutral behavior. The po- 
tency of future control, with reward constant, 
is further demonstrated by the fact that, as 
predicted, significantly more children in the 
two high-control groups combined rehearsed 
both aversive (p < .005) and neutral (p 
< .01) behaviors than in the two low-control 
conditions combined. Thus, rewardingness 
significantly increased the rehearsal of neutral 
but not aversive behaviors whereas control 
affected the rehearsal of both aversive and 
neutral behaviors. It is striking that in the 
two low-control treatments not a single sub- 
ject rehearsed aversive behavior (Figure 1). 

It should also be noted that when either 


TABLE 1

BETWEEN-TREATMENT CHI-SQUARE COMPARISONS
OF SUBJECTS REPRODUCING AVERSIVE AND
NEUTRAL BEHAVIOR IN EACH PHASE

                                      Rehearsal              Transmission
Treatment comparisons             Aversive    Neutral     Aversive    Neutral

High reward-high control versus   5.30**      7.62****    .16         [illegible]
  low reward-low control
High versus low reward            .53         3.28*       2.87*       [illegible]
High versus low control           8.47****    5.83***     .32         [illegible]

Note. df = 1, p one-tailed.
   * p < .05.
  ** p < .025.
 *** p < .01.
**** p < .005.


[Figure 2: bar chart. Legend: transmission of neutral behaviors; transmission of aversive behaviors. Vertical axis: number of subjects. Horizontal axis: treatments (high R-high FC, high R-low FC, low R-high FC, low R-low FC).]

Fig. 2. Number of subjects transmitting neutral
and aversive behaviors as a function of the model's
rewardingness (R) and future control (FC).


reward or control were low the other variable 
did not produce significant differences in the 
rehearsal of either aversive or neutral behav-
iors. That is, when the model’s rewardingness
was low, differences in her control did not 
affect behavior rehearsal and, conversely, 
when her control of the child's future was low, 
variations in her rewardingness did not result 
in significant differences in behavior rehearsal. 
Likewise, there were no significant differences 
in behavior rehearsal between treatments in 
which either reward or control was high when 
the other variable was low and the condition 
in which both reward and control were low. 
If it were possible to measure an interaction
effect in these data, it would probably be 
sizable. 


Behavior Transmission 


The number of subjects in each group who 
transmitted neutral and aversive behaviors to
the clown is shown in Figure 2. Comparison 
of the combined high-reward conditions with 
the combined low-reward groups, presented in 
Table 1, shows that the model’s rewardingness 
led to greater transmission of aversive behav- 
iors (p < .05) but did not affect the trans-
mission of neutral behaviors. Table 1 also


5 It is possible that a fatigue factor was operative
because children rarely reproduced the model’s be-
haviors in more than one phase. Only 22% of the
children who rehearsed the model’s aversive be-
haviors and 33% who rehearsed her aversive be-
haviors subsequently transmitted these behaviors.


shows that the model's future control did not
affect the extent to which the children trans- 
mitted any of her behaviors. Likewise, the 
condition in which the model was both re- 
warding and had high control did not produce 
more behavior transmission than the treat- 
ment in which the model’s rewardingness and 
control were both low. 


Discussion 


The results demonstrate that observed be- 
haviors may be reproduced and transmitted 
to others without external reinforcement for 
their performance, even when the observer 
was the object of the modeled behaviors and 
received aversive consequences from them. 
Moreover, the extent to which the model’s 
behaviors were reproduced was affected by 
her rewardingness or use of noncontingent re- 
inforcement and her future control over the 
subject. 

The present data suggest that the degree to 
which subjects reproduce components of the 
aversive, as well as the neutral, behavior 
which they observed and whose outcomes 
they directly experienced, is determined, in 
part, by the characteristics of the social agent 
who initially performed the behaviors. It is 
noteworthy that the percentage (34) of trans-
mitted aversive behavior exceeded the per-
centage (21) of transmitted neutral behavior.

The obtained results support theoretical 
formulations stressing the rewardingness of 
the model and his power as determinants of 
the degree to which his behavior is adopted 
and indicate that both variables are useful 
for an adequate theory of imitation. The find- 
ings indicate, however, that these two vari- 
ables have somewhat different effects as a 
function of the type of behavior displayed by
the model and the stimulus situation in which 
the subject reproduces it. Both reward and 
control significantly affected the rehearsal of 
neutral behavior. Aversive behaviors were re- 
hearsed only when the model had high con- 
trol, and even with a highly rewarding model 
not a single child rehearsed them if the mod-


Such a fatigue factor may account for the lack of
differences between treatments in the transmission
of neutral behaviors.

el's control was low. Both neutral and aver-
sive behavior rehearsal were rare in all condi- 
tions in which either reward or control was 
low (see Figure 1). Indeed, when reward was 
low variations in control did not significantly 
affect behavior rehearsal; conversely, when 
control was low variations in reward did not 
produce differences in rehearsal. The fact that 
the differences produced in behavior rehearsal 
by one variable were enhanced by the pres- 
ence of the other suggests that reward from 
the model may be a necessary condition for 
the effectiveness of his control and likewise 
that a rewarding model who has little con- 
trol over the subject's future resources is no 
more effective than one who is not rewarding 
and does not have control.

In contrast, the transmission of the model’s 
behavior by the child to another person was 
affected only by the model's rewardingness 
and not by his future control. Children who 
had been exposed to a noncontingently re- 
warding model transmitted components of her 
aversive behavior to another person more fre- 
quently than those who had received little 
noncontingent reinforcement. Although not 
impressively strong, this result was signifi- 
cant whereas there was no indication that the 
model's future control over the child affected 
the extent to which he transmitted the mod- 
el's behavior. Indeed, rewardingness and fu- 
ture control may have opposite effects on the 
transmission of aversive behavior and these 
antagonistic effects may have tended to cancel 
each other. Exposure to a noncontingently re- 
warding model may have disinhibited the chil- 
dren about transmitting aversive behaviors to 
another person, whereas exposure to a model 
who had great control over the child's fu- 
ture may have served to inhibit the transmis- 
sion of such aggressive behavior, even in the 
physical absence of the model. Calling an 
adult, albeit one dressed as a clown, a “store- 
wrecker” and making him wait while the 
child repetitiously counts “1, 2, 3” is unlikely 
to incur the pleasure of a “new nursery school 
teacher” even if the same behavior was origi- 
nally displayed by her. To be sure, the model 
was absent in this phase of the study but it 
is not unlikely that the child’s behavior was 
influenced by expectations concerning what 
would please her. Subjects may also have
feared that the clown would report their be-
havior to the teacher. Fear of negative conse-
quences from the model for aggression to the 
other adult would inhibit children who be- 
lieved the model was their new teacher more 
than those who believed she was about to 
leave their school forever. Although the dif- 
ferences were not significant, children trans- 
mitted aversive behaviors most frequently 
when the model had been rewarding and had 
low control and least frequently when she 
was not rewarding but had high control (see 
Figure 2). These trends support the inter- 
pretation that rewardingness disinhibited the 
children, thus enhancing transmission of aver- 
sive behaviors, whereas control over the chil- 
dren's future inhibited the transmission of 
aversive behaviors because of the greater like- 
lihood of delayed negative consequences from 
the powerful model. 

The results on transmission of aversive be- 
haviors are in direct opposition to those an- 
ticipated by theories of defensive identifica- 
tion with the aggressor. Such formulations 
predict that the punitive behaviors displayed 
by a highly powerful, and thus potentially 
threatening, model would be transmitted most 
frequently. Instead, the data indicate that 
such transmission was facilitated by the mod- 
el’s rewardingness, irrespective of her control 
over the subject. The effects of high control 
when the model himself strongly encourages 
and disinhibits the subjects about transmit- 
ting punitive behaviors remain unknown and 
merit investigation. 

An attempt was made in the present study 
to determine whether the two variables of re- 
wardingness and control affect only the child’s 
performance of a model’s behavior or whether 
they affect the acquisition or learning of those 
behaviors. If the obtained treatment effects 
were due to differences in learning these were 
probably mediated by differences in the 
amount of attention given the model as a 
function of her manipulated social character- 
istics. It is plausible that a new teacher or 
rewarding adult was more closely observed 
than a stranger or nonrewarding adult. Im- 

mediately after the transmission phase the 
child was asked by both the model and the 
clown to recall what had happened when the
model had first showed him how to play store.


For each item of the model's behavior which 
the child correctly recalled he was given an 
attractive sticker and warmly praised. There 
were no differences between the groups in the 
number of children recalling either aversive or 
neutral behaviors, although there was a slight, 
but not significant, trend for more children in 
the low reward-high control group to recall 
neutral behavior. These data were considered 
unsatisfactory, however, since many of the 
children who reproduced behavior did not re- 
call it. This was true of 40% of the children 
with respect to aversive behavior and 50% of 
the children with respect to neutral behavior. 
In spite of their obvious inadequacy as a 
measure of total learning it is clear from these 
data that the behaviors which the children 
learned considerably exceeded those which 
they performed. Thus 21% of the subjects
recalled aspects of the model's aversive be-
havior that they did not perform while 25%
recalled aspects of the model's neutral be- 
havior that they did not perform. In addition, 
of the 26 subjects who performed none of the 
model’s behaviors only 11 were unable to re- 
call any of her behaviors. The discrepancy 
between performed and acquired behaviors
was most striking in the low reward-high con-
trol group. Whereas only 5 subjects performed
aspects of the model's behavior 6 who did not 
perform it recalled it. The deficiencies of the 
measure prevent firm conclusions but these 
data are suggestive that the obtained differ- 
ences between the experimental groups re- 
flect differences in performance and not in 
acquisition.


OPPONENT'S PERSONALITY, EXPECTATION OF SOCIAL
INTERACTION, AND INTERPERSONAL BARGAINING ¹


DAVID MARLOWE, KENNETH J. GERGEN, AND ANTHONY N. DOOB ²


Harvard University 


In 2 separate experiments Ss participated in a 2-person, non-zero-sum game 
with an opponent whose behavior during the game was predominantly co-
operative. In Exp. I Ss who anticipated further interaction with their opponent
were less exploitative than those who did not. In Exp. II it was predicted that
Ss would exploit an egotistical opponent to a greater extent when future
interaction was anticipated than when it was not, but that when the opponent
was seen to be self-effacing the reverse would be true. The results supported
the prediction.


The analysis of bargaining behavior within 
the context of two-person games has been 
the subject of considerable research in recent 
years. In contrast to an early concern with 
economic and mathematical considerations 
(Luce & Raiffa, 1957), recent investigations 
have emphasized the interpersonal aspects of 
the bargaining situation. These latter studies 
view bargaining as a special instance of social 
interaction amenable to analysis within the 
framework of experimental social psychology. 
Deutsch and Krauss (1960), for example, 
studied the effects of unilateral and bilateral 
threat on the ability of persons to reach agree- 
ment in a simulation game; Scodel, Minas, 
Ratoosh, and Lipetz (1959) and Minas, 
Scodel, Marlowe and Rawson (1960) ex- 
amined bargaining behavior in a two-person 
game under varying communication condi- 
tions, payoff values, and opponent strategies; 
Deutsch (1958) related bargaining behavior 
to the variables of trust and suspicion; and 
Solomon (1960) investigated the influence of 
various power relationships on bargaining 
strategies. The focus in the majority of these 
studies has been on the degree to which a 
person will cooperate with or exploit an op- 
ponent under varying stimulus conditions. In 
general, these investigations seem to indicate 
that unless cooperative strategies are fostered 
through special experimental instructions or 
intersubject communication, persons will 
choose to compete with one another and will 
play in such a way as to maximize the differ-
ence between their own and their partner's
payoffs.

Given the recent trend toward viewing
economic bargaining as a form of social inter-
action, it has been only natural to seek person-
ality and attitudinal correlates of bargaining
behavior. Using a two-person game of the
prisoner’s-dilemma format, Deutsch (1[...])
reported that authoritarians (as measured by
the F Scale) tended to be less trusting of the
other player and to make more uncooperative
choices. Marlowe (1963) found that p[...]
dependent persons were disposed to respond
to unconditionally cooperative behavior with
cooperation on their own part. And Lul[...]
(1960) reported that internationally [...]
as compared to isolationists were more co-
operative in a two-person game. Though few
in number, such experimental studies are con-
sistent in indicating that reliable differences
in bargaining behavior are associated with
personality predispositions of the players.

The present investigation sought to con-
tinue and broaden the emphasis on the
social aspects of the bargaining situation by
concentrating on two variables which are
basic to most social relationships. When
two persons are engaged in a real-life bargain-
ing transaction, there are at least two [...]
of information which are focal in determin-
ing the course of the relationship. First,
we generally form at the outset of the re-
lationship some impression of the kind of
person with whom we are dealing. Such
impressions may determine to a large degree
the type of behavior manifested toward the
other person (cf. Davis, 1962; Ger[...]


EXPECTATION AND INTERPERSONAL BARGAINING 


Jones, 1963). Second, one's estimate of the 
longevity of a relationship may also play an 
important role. For example, Thibaut and 
Kelley (1959) have discussed the fact that 
persons are often more revealing to total 
strangers than to close friends; Gergen and 
Wishnov (1965) have explored the effects 
of varying the degree of anticipated inter-
action on the way in which a person will
present himself to another.

The variable of interaction anticipation 
would seem to have special significance for 
the area of interpersonal bargaining. The vast 
majority of bargaining studies have been con- 
ducted under conditions in which subjects 
expected no future confrontation with their 
opponents. However, in everyday relation- 
ships, persons seldom find themselves in the 
highly transient bargaining arena best char- 
acterized, perhaps, by the New York Stock 
Exchange. More typically, people expect to 
“live” with their behavior, and expect to de- 
fend and discuss it in ongoing relationships. 
This paper, then, reports on two investiga- 
tions which sought to relate bargaining be-
havior in a two-person game to the variables
of the perceived personality of the opponent ³
and expectation of future interaction with the 
opponent. 

In the first investigation (Experiment I) 
the aim was simply to demonstrate that when 
bargaining with a predominantly cooperative 
other, persons will play more cooperatively 
when expecting to be confronted by this other 
than when no postbargaining interaction is 
anticipated. Here it was reasoned that the 
exploitation of another person who is trying 
to be cooperative is socially undesirable be- 
havior. Thus a face-to-face meeting with an 
exploited partner would create embarrassment, 
if not arouse guilt for the subject. Hence, it 
was felt that subjects would feel a stronger 
disposition to do the gracious thing and co- 
operate when they expected to be confronted 
with their behavior. 

Given that the expectation of future inter- 


3 Although the term “opponent” has traditionally 
been used to refer to the participants in two-person 
games, in many instances the term is misleading. 
In this study, as in many others, the notion of a 
game or competition is never specifically introduced
to the subjects by the experimenter.

action will induce greater cooperation, Ex-
periment II attempted to show that inter-
action anticipation would have quite different 
effects on bargaining behavior depending on 
the perceived personality characteristics of 
the opponent. At least one major dimension 
along which persons often characterize others 
is that of egotism versus humility. As Gergen 
and Wishnov (1965) have reasoned, inter- 
acting with an egotist can often be threaten- 
ing, and at the outset tends to reduce one's 
power in a social relationship. The optimum 
strategy for restoring power when bargaining 
with an egotist would seem to be successful 
competition. It was further felt that this 
tactic would most likely occur when further 
interaction was anticipated. There would seem 
to be less need to restore one's power in a 
relationship which is seen to be short-lived. 
It was hypothesized then that when dealing 
with a partner who describes himself in 
grandiose terms, the direct effects of antici- 
pated interaction (Experiment I) would not 
hold. Rather, it was expected that when 
bargaining with an egotistical other the an- 
ticipation of further interaction would lead to 
greater exploitation of the partner. 

On the other hand, a person who is seen to 
be self-effacing and humble seems to elicit 
quite different reactions. Such a person often 
evokes feelings of pity and nurturance on 
the part of others. To exploit such a person 
would give rise to feelings of guilt. Under 
public scrutiny, the exploitation of such a 
person would also be conspicuously undesir- 
able. However, it is also just such a person 
who invites exploitation. The person who dis- 
plays only his shortcomings lays himself open 
to the other's advantage. The variation in 
anticipated interaction should make an im- 
portant difference concerning which of these 
two reactions predominates. Feelings of guilt 
and the undesirable features of exploitation 
should be particularly salient when further 
interaction is expected. On the other hand, 
when one will not be held responsible for his 
behavior, or have to face his victim, the 
invitation to exploitation should be accepted. 
It was thus felt that when dealing with a 
self-effacing other, the findings of the initial 
experiment would be replicated even more 
dramatically. That is, when further inter-
action is anticipated, little exploitation should
take place; when no further interaction is 
expected, exploitation should be maximal. 
The hypothesis for Experiment II can thus be 
summarized as follows: When interacting with 
a predominantly cooperative other, with 
anticipation of further interaction, subjects 
will exploit this other to a greater degree if 
he is seen to be egotistical than if he is seen 
to be self-effacing; when no further inter- 
action is anticipated, the reverse will be true. 


EXPERIMENT I


This experiment was designed to study the
effects of an anticipated social interaction on 
bargaining behavior. The specific hypothesis 
was that subjects who bargain with an ex- 
pectation of later meeting their opponent will 
make significantly more cooperative choices 
than persons who bargain with no expectation 
of later meeting their opponent. 


Method 


Subjects. Twenty-three college males, all fresh-
men at Harvard University, participated in the
experiment. Students were chosen at random from 
the university directory and asked to serve in a 
study of bargaining and decision making. Virtually 
all the students contacted agreed to serve. The sub- 
jects were informed that they would be paid a 
minimum of $1.50 for their time, and that they 
would “have an opportunity to make more money." 

Procedure. Subjects participated in the experiment 
in groups ranging in size from four to six. Upon 
arriving at the experimental site each subject was 
seated at a large table. The table was partitioned in 
such a way that subjects could neither see, signal, 
nor otherwise communicate with each other. The 
experiment was described by the experimenter as 
focusing on the way in which persons make decisions 
when bargaining with each other under restricted 
conditions. Subjects were told that each would be 
working with a partner, ostensibly one of the other 
subjects at the table. A series of decisions were to 
be made by each subject and his partner, and as a 
result of these decisions various amounts of money 
could be made. During these instructions and those 
which followed the experimenter never used such 
words as “win,” “lose,” “beat,” “game,” “op- 
ponent.” In this way it was hoped that the arousal 
of strictly competitive motives could be avoided. 

Subjects were then introduced to what has formally 
been termed a non-zero-sum game. Such a game can 
be contrasted with the zero-sum game in which 
there is no possibility for players to increase their 
payoffs through cooperation. Subjects were told 
that they would be faced with a number of decision 
trials and that on each trial they would be re-
quired to choose between pressing either a black or
a red button. Each subject was led to believe that
his partner would simultaneously be deciding which
of the two buttons to press. On each trial there
were thus four possible combinations which could
occur for any subject pair. Subjects were then
introduced to a payoff matrix which was posted in
front of each subject and which displayed the
amounts of money each would obtain for each of
the four combinations. It was explained to the
subjects that if on any trial both partners chose
red, each would receive $.01; if both chose black each
would receive $.03; if one chose red and the other
chose black, the first would receive $.05 and the
second would receive nothing. As can be seen, this
game is of the standard “prisoner’s-dilemma” variety
(cf. Rapoport & Orwant, 1962).
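
The payoff matrix just described can be written out and checked against the usual prisoner's-dilemma ordering. The sketch below is illustrative only: the names `PAYOFFS`, `payoff`, and `is_prisoners_dilemma` are hypothetical, the labels temptation, reward, punishment, and sucker are conventional game-theory terms that do not appear in the article, and the second inequality (joint cooperation pays more than alternating exploitation) is a commonly added criterion rather than one stated here.

```python
# Payoffs in cents, keyed by (own choice, partner's choice) and giving
# (own payoff, partner's payoff), taken from the instructions to subjects.
PAYOFFS = {
    ("black", "black"): (3, 3),  # both cooperate
    ("black", "red"): (0, 5),    # own cooperation is exploited
    ("red", "black"): (5, 0),    # own exploitation of a cooperator
    ("red", "red"): (1, 1),      # both choose Red
}


def payoff(own, partner):
    """Return (own payoff, partner's payoff) in cents for one trial."""
    return PAYOFFS[(own, partner)]


def is_prisoners_dilemma(payoffs):
    """Check the conventional ordering T > R > P > S and 2R > T + S."""
    temptation = payoffs[("red", "black")][0]
    reward = payoffs[("black", "black")][0]
    punishment = payoffs[("red", "red")][0]
    sucker = payoffs[("black", "red")][0]
    return (temptation > reward > punishment > sucker
            and 2 * reward > temptation + sucker)


print(payoff("red", "black"))         # exploiting a cooperative partner
print(is_prisoners_dilemma(PAYOFFS))
```

Red yields the larger individual payoff whatever the partner chooses (5 against 3, and 1 against 0), while mutual Black yields the larger joint payoff (6 cents against 5 or 2), which is the dilemma the subjects faced.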

After the game had been thoroughly explained, 
half of the subject groups (confrontation con- 
dition) were told: 


In the past we have found that we can learn 
much more about what went on if we have each 
pair of subjects meet with me to discuss why they 
behaved as they did. So, after we have finished 
here, you will meet with the person you bargained 
with to discuss why you behaved as you did. 


For the remaining subject pairs these instructions 
were omitted (no-confrontation condition). 

The apparatus used to conduct the game was 
similar to that devised by Crutchfield (1955) to 
study conformity behavior. With this equipment
the subjects’ choices register on the experimenter's 
panel. By throwing various switches the experimenter
can inform each subject of the choice which his 
partner has made, and at the same time also in- 
dicate the resultant payoff. The experimenter can, of 
course, supply the subjects with inaccurate informa- 
tion and announce whatever choices the concerns 
of the study demand. The present experiment con- 
sisted of 30 trials. On 24 of these trials each subject 
was informed that his partner had chosen "Black" 
(the cooperative choice). The "partner's" Red
choices were randomly distributed over Trials 4-30. 
Each subject was thus led to believe that his 
partner was playing a predominantly cooperative 
game, and was thus faced with a dilemma. He could 
select Black and thereby maximize joint gain ($.03 
each), or he could press Red and exploit his partner’s
good intentions ($.05 for him and nothing for his 
partner). Inasmuch as it was the only choice which 
avoided the maximization of joint gain, the number 
of trials on which the subject selected Red served as 
a measure of exploitation. 
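
The preprogrammed feedback and the exploitation measure can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `partner_schedule` and `exploitation_score` are hypothetical, and the six Red reports (30 trials minus the 24 Black reports) are drawn uniformly without replacement from Trials 4-30, which is one plausible reading of "randomly distributed."

```python
import random


def partner_schedule(n_trials=30, n_red=6, first_red_trial=4, seed=None):
    """Return the preprogrammed 'partner' choice reported on each trial."""
    rng = random.Random(seed)
    # Draw the Red trials without replacement from Trials 4-30.
    red_trials = set(rng.sample(range(first_red_trial, n_trials + 1), n_red))
    return ["Red" if trial in red_trials else "Black"
            for trial in range(1, n_trials + 1)]


def exploitation_score(subject_choices):
    """Number of trials on which the subject chose Red."""
    return sum(1 for choice in subject_choices if choice == "Red")


schedule = partner_schedule(seed=0)
print(schedule.count("Black"), schedule.count("Red"))  # 24 6
```

Because no Red report can fall on Trials 1-3, every subject first encounters a run of cooperative choices, and the exploitation score ranges from 0 to 30.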

After the game had been terminated all subjects 
were supplied with an adjective check list containing
81 adjectives arranged in alphabetical order. Subjects
were asked to place a check by those 10 adjectives
which they felt best described their partner. After
completing the check list subjects in the confronta-
tion condition were informed that they would not
have to meet their partner because, “we are short of
time and have collected enough data.” After being
paid what they had earned in the game and
for their participation in the experiment, the sub-
jects were allowed to depart.


Results and Discussion 


The mean number of Red (exploitative) 
choices made by the subjects in the no- 
confrontation condition was 24.5 (n = 12),
as compared to a mean of 19.1 in the con-
frontation condition (n = 11). The difference
between these means is significant beyond the
.05 level (t = 2.34).* Thus, in line with the
prediction, subjects who expected to discuss 
their behavior after bargaining were signifi- 
cantly less exploitative (or more cooperative). 
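
The arithmetic behind this comparison can be retraced in a few lines. The sketch below assumes that the reported t comes from an independent-groups test with pooled variance, which the text does not state; the helper names are hypothetical. The critical value 2.080 is the standard two-tailed .05 tabled value of t for 21 degrees of freedom.

```python
def pooled_df(n1, n2):
    """Degrees of freedom for an independent-groups t test (assumed design)."""
    return n1 + n2 - 2


def implied_standard_error(mean1, mean2, t_value):
    """Standard error of the difference implied by the reported means and t."""
    return abs(mean1 - mean2) / abs(t_value)


df = pooled_df(12, 11)                              # 21
se = implied_standard_error(24.5, 19.1, 2.34)       # about 2.31 choices
T_CRITICAL_05_TWO_TAILED_DF21 = 2.080               # from standard t tables

print(df, round(se, 2))
print(2.34 > T_CRITICAL_05_TWO_TAILED_DF21)         # True
```

On these assumptions the reported t of 2.34 exceeds the tabled criterion, consistent with the statement that the difference is significant beyond the .05 level.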

It was ventured above that the exploitation 
of a well-intentioned other, combined with 
the knowledge that one will be confronted 
by this other, leads to anticipatory guilt or 
embarrassment and thus the avoidance of 
exploitation. However, this explanation de- 
pends on the supposition that subjects did, 
indeed, perceive the partner as well-inten- 
tioned. Although four out of every five choices 
which the partner made were cooperative,
there is an alternative way in which such
choices might have been viewed by the sub-
jects. The persistent selection of Black could
have been seen as an attempt on the partner's
part to seduce the subject into choosing Black,
with the partner then intending to defect to
Red. However, a perusal of the adjectives
used to describe the partner indicates that this
latter way of viewing the partner was very
unlikely. The five adjectives most frequently
checked as being descriptive of the partner
were: persistent, conservative, dependable,
naive, and generous. Those most infrequently
checked were: treacherous, tolerant, tactful,
neat, and greedy.

It would seem, then, that the disposition 
to act more equitably under conditions of 
confrontation serves to enhance one's public 
image. In the no-confrontation condition what- 
ever guilt or embarrassment is aroused is 
largely personal and not in itself sufficient to 
inhibit exploitation. This result is reminiscent 
of the findings in the area of conformity and 
social influence which indicate that behavioral
conformity is enhanced under conditions of
public surveillance (Argyle, 1957; Mouton,
Blake, & Olmstead, 1956).


*All tests of significance reported are two-tailed.


EXPERIMENT II 


Experiment I indicated that under circumstances 
in which no information is available regarding the 
person with whom one is bargaining, there is a 
tendency for subjects to be more exploitative when 
they anticipate no postbargaining confrontation. 
One personality dimension which should interact 
significantly with a variation in interaction anticipa- 
tion is egotism versus humility. This consideration 
lent itself to a 2 X 2 factorial design with the cross- 
cutting dimensions of opponent's personality (ego- 
tistical versus humble) and interaction anticipation 
(confrontation versus no confrontation). The pro- 
cedure for this experiment was highly similar to 
that of Experiment I, but may be compared in the 
following respects: 

Subjects. The subjects were 44 Harvard fresh- 
men recruited in the same manner as those used in 
Experiment I. Care was taken to obtain subjects 
not living in the same dormitories as those previ- 
ously used. Eleven subjects were assigned at random 
to each of the four experimental conditions. 

Procedure. Subjects were again seated at the 
partitioned table and told that the experiment dealt 
with various aspects of bargaining and decision 
making. However, in addition, the experimenter in- 
dicated that he was interested in some of the social 
aspects of bargaining and that each subject would be 
asked to describe himself on a set of forms which 
would then be given to his partner to examine. In 
this manner, subjects were told, they would each 
be able to form some sort of impression of the 
person with whom they would be bargaining. 

All subjects were then provided with a 14-item 
questionnaire. Nine of these items were in the form 
of 12-point rating scales anchored at the extremes 
with such phrases as "clear-thinking"-"fuzzy minded," 
and "efficient"-"inefficient," etc. The remaining items 
were taken from the Janis and Field (1959) self-
esteem scale. A representative item was, "In general,
how confident do you feel about your abilities?" 
Subjects answered on a 5-point scale which ranged 
from “very” to “not at all.” Subjects were asked not 
to place their names on the questionnaires. 

After the questionnaires were completed and col-
lected, each subject was provided with one
of two specially prepared questionnaires which
ostensibly had been filled out by his partner. These 
questionnaires were intended to create either the 
impression that the partner was egotistical or that 
he was extremely humble or modest. The first nine 
items on the questionnaire had earlier been given to 
an independent group of undergraduates. From the 
ratings made by this group it was possible to 
establish modal response patterns for the various 
scales. For the present experiment the questionnaire 
for the egotistical partner was prepared in such a 
way that the scale points checked were always 
closer to the positive end of the scale than the
modal responses by 2 scale points. On the other hand,
the self-ratings for the “humble partner” always
differed from the modal responses by 2 scale points
toward the negative end of the scale. On the self-
esteem measure the egotistical partner always en-
dorsed the most extreme positive position, while the 
humble partner always endorsed the most negative 
position on each of the five items. 
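
The construction of the two bogus questionnaires can be illustrated with a short sketch. The code is hypothetical throughout: it assumes that higher scale values lie toward the positive end, that shifted ratings are clipped to the limits of the 12-point scale, and that the modal responses shown are invented for the example, since the actual modal values are not reported.

```python
def shifted_profile(modal_responses, shift, low=1, high=12):
    """Shift each modal rating by `shift` points, staying within the scale."""
    return [min(high, max(low, rating + shift)) for rating in modal_responses]


def self_esteem_profile(egotistical, n_items=5, low=1, high=5):
    """Most extreme positive (egotistical) or negative (humble) position."""
    return [high if egotistical else low] * n_items


# Invented modal responses for the nine 12-point rating scales.
modal = [7, 8, 6, 9, 7, 8, 7, 6, 8]

egotistical_partner = shifted_profile(modal, +2) + self_esteem_profile(True)
humble_partner = shifted_profile(modal, -2) + self_esteem_profile(False)

print(len(egotistical_partner), len(humble_partner))  # 14 14
```

Each prepared questionnaire thus contains the same 14 items that the subjects themselves completed, differing only in the direction in which the ratings depart from the modal pattern.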

After the subjects had been given an opportunity 
to read the ratings made by their supposed partners, 
they were given a six-item questionnaire on which 
they were to rate their impression of their partner. 
These ratings were not to be seen by the partner. 
Half of the subject groups were then given the 
same instructions that subjects in the confrontation 
condition had been given in Experiment I. The re- 
maining subject groups received no such instructions. 

The game used in this experiment was identical 
to that used in the initial experiment, and the sub- 
jects again found their partner to be almost
persistently cooperative. After the game was com-
pleted, subjects were given the same six-item ques-
tionnaire filled out just prior to the game and asked
to consider again their impressions of their partner.
The session was terminated in the same manner as 
the initial experiment. 


Results 


Before examining the results regarding sub- 
jects’ bargaining behavior, it is appropriate 
to ask whether the attempt to manipulate the 
subjects’ perception of the partner’s person- 
ality was effective. The first impression rat- 
ings, made by the subjects immediately after 
being exposed to self-ratings supposedly made 
by their partner, provide a direct check on 
the effectiveness of this manipulation. These 
impression ratings were made on a number of 
12-point scales. Two of these scales (“self- 
centered versus humble” and “modest versus 
conceited”) referred specifically to the in- 
tended personality induction. Combining the 
ratings made on these two scales, the mean 
ratings of subjects facing the egotistical part- 
ner were compared by a t test with the mean
ratings of those expecting to interact with a
humble partner (n = 22 for both groups).
With a range of possible scores from 2 to 24
the former group obtained a mean of 22.05,
whereas the mean for the latter group was
only 6.36. The difference between these means
is highly significant (t = 19.7, p < .00001).
Quite clearly, the manipulation of perceived 
personality was effective.

Turning to the major results, it will be
recalled that an interaction between perceived
personality and expectancy of interaction was
predicted. More specifically, it was hy-
pothesized that when expecting a postexperi-
mental confrontation, subjects would exploit
the egotistical partner more than the humble
partner, but that when no further interaction
was anticipated the reverse would be true. The
dependent variable in Experiment II (as in
Experiment I) was the number of exploita-
tive (Red) choices made by subjects over the
30 trials. Figure 1 presents the mean number
of Red choices made in the four conditions,
while Table 1 contains the results of an
analysis of variance performed on these data.
As can be seen in Table 1, there were no sig- 
nificant main effects due to either the per- 
ceived personality or the confrontation vari- 
ables. As anticipated, however, there is a 
significant personality × confrontation inter-
action (p < .05). Consulting Figure 1, it can
be seen that the configuration of the means
lends full support to the major hypothesis.
Subjects cooperated more with a humble
person they expected to meet than with one
they did not expect to meet; when the partner
was an egotist, however, there was greater
cooperation if they did not expect to interact 
with him. 
Fig. 1. Mean number of Red (exploitative) choices
for subjects in Experiment II.


EXPECTATION AND INTERPERSONAL BARGAINING 


TABLE 1 


ANALYSIS OF VARIANCE OF EXPLOITATIVE 
CHOICES IN EXPERIMENT II 


Source                      df        MS        F
Confrontation (A)            1      1.2?       <1
Partner personality (B)      1     73.05     2.12
A × B                        1    158.07     4.58*
Error                       40     34.48

*p < .05.
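
The entries of Table 1 can be checked against one another, since each F ratio is the effect mean square divided by the error mean square. The sketch below is illustrative only; it takes the error mean square to be 34.48, the value implied by the two legible F ratios, which is an assumption wherever the printed entry is unclear. The names are hypothetical.

```python
MS_ERROR = 34.48  # assumed: the value implied by the reported F ratios

MEAN_SQUARES = {
    "Partner personality (B)": 73.05,
    "A x B": 158.07,
}


def f_ratio(ms_effect, ms_error=MS_ERROR):
    """F ratio for a 1-df effect tested against the error mean square."""
    return ms_effect / ms_error


def error_df(n_subjects, n_cells):
    """Error degrees of freedom for a between-subjects factorial design."""
    return n_subjects - n_cells


for source, ms in MEAN_SQUARES.items():
    print(source, round(f_ratio(ms), 2))

print(error_df(44, 4))  # 40
```

With 44 subjects in four cells the error term carries 40 degrees of freedom, as printed, and the recomputed ratios round to the tabled 2.12 and 4.58.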


An analysis of the pre-post change scores 
of subjects' perceptions of the partner re-
vealed several additional facts. These six 
rating scales were arranged so as to form 
three distinct perceptual clusters: egotism, 
independence, and likability. An analysis of 
the prescores on these three dimensions in- 
dicated no significant differences between 
confrontation and no-confrontation subjects 
within either condition of perceived person- 
ality. In other words, as would be hoped, 
there were no systematic differences in the 
way the partner was seen at the outset of 
the bargaining task which resulted from the 
variation in confrontation. Second, it was also 
found that in addition to seeing their partner 
as more egotistical (as discussed above), sub- 
jects facing the egotistical partner also saw 
him as significantly (p < .05) less likable
prior to the bargaining task. This finding con-
firms those of Pepitone (1964). 

Finally, whereas the perception of the hum- 
ble partner was found to be virtually unal- 
tered as a result of the bargaining experience, 
such was not the case with the egotistical 
partner. As a result of bargaining with the 
egotistical partner, subjects came to see him 
as significantly (p < .025) less egotistical,
less independent (p < .025), and more likable
(p < .05). In short, it seemed that regardless
of confrontation, the predominantly coopera-
tive behavior of the partner in this condition
was seen as inconsistent with the personality
traits ascribed to him initially. The coopera-
tive behavior, in effect, appeared to alter the
subjects’ perception so that at the termina-
tion of the experiment the perceived dif-
ferences between the egotistical and humble
partner were much less marked (though still
significant beyond the .01 level with re-
gard to perceived egotism). One might be led
to speculate that had the game gone on in-
definitely, the differences in exploitation due 
to initial perceived personality might have 
eventually washed out. 


DISCUSSION 


The results of Experiment I indicate quite 
clearly that when no personal information is 
available concerning the person with whom 
one is bargaining, and this other plays a 
predominantly cooperative game, there is a 
greater tendency to exploit this other when 
not expecting a later confrontation. The find- 
ings from Experiment II, however, modify 
this picture substantially. When personal in- 
formation becomes available regarding the 
other, this information may indeed reverse 
the role of the confrontation variable. Spe- 
cifically, when the other person is perceived 
to be egotistical and self-centered, he is more 
likely to be exploited when future interaction 
is anticipated than when it is not. 

The prediction of this latter finding was 
based on the assumption that a boastful other 
tends to force those with whom he interacts 
into an undesirable low status position. This 
state of affairs can be described as an im- 
balance in displayed power, and may be re- 
acted to by attempts to redress the imbalance.
In the Gergen and Wishnov (1965) study, for 
example, subjects faced with an egotist be- 
gan describing themselves in an extremely 
positive manner and would not reveal short- 
comings. However, it would appear that the 
need to redress an imbalance in displayed 
power is less powerful when the interaction is 
seen to be short-lived. The expectancy of be- 
ing confronted by an egotist who may have 
been very exploitative would seem to be an 
intimidating experience, demanding the de- 
fensive reaction of exploitation. In line with 
this speculation, there was a tendency for fe- 
male subjects in the Gergen and Wishnov 
study to magnify their positive features to a 
greater extent when future interaction with 
the egotist was anticipated. The male sub- 
jects in the present experiment seem to have 
demonstrated a similar tendency to an even 
greater extent through their differential ex- 
ploitation of the egotist as a function of their 
expectancy of confrontation.

Turning to the subjects' reactions to the
humble partner, interpretation of the results 
raises certain difficulties. It was initially 
speculated that the self-effacing partner would 
instigate two opposing tendencies for sub- 
jects: the humility of the partner would cause 
them to feel embarrassed if they engaged in 
exploitation, and yet this same humility would 
increase the likelihood that monetary rewards 
could be obtained through exploitation. It was 
hypothesized that the variation in confronta- 
tion would split these tendencies in such a 
way that embarrassment would become more 
salient when interaction was anticipated, and 
exploitation would become more attractive 
when there was no possibility of being held 
responsible for one’s actions. 

Some check on these speculations can be 
obtained by comparing the amount of ex- 
ploitation of the humble partner in Experi-
ment II with the degree to which the partner
about whom nothing was known in Experi-
ment I was exploited. In Experiment II all
the factors operating to produce the results of 
Experiment I should be present in addition 
to those created by the knowledge of the 
partner’s humility. If the above speculations 
are correct, in Experiment II there should 
have been more exploitation in the no-con- 
frontation condition and less in the confronta- 
tion condition than found in Experiment I. 
Consulting the results of the two experi- 
ments, it is found that these expectations are 
only partially verified. In the confrontation 
conditions the results were in the anticipated 
direction; when the partner was seen as hum- 
ble the mean number of Red button presses 
was 18.9, whereas it was 19.1 when there was 
no personal information available. On the 
other hand, in the no-confrontation condition 
when the partner was seen as humble the 
mean was 22.9; when there was no informa- 
tion available it was 24.5. The indication is 
that the humble partner tended to elicit fewer 
exploitative choices regardless of expectancy 
of confrontation. Although this tendency does 
not reach statistical significance, it does cast 
some doubts on the speculation that the hum- 
ble other invites exploitation when he does 
not have to be confronted. If anything, the 
results seem to suggest that there is a gen-
eralized pity felt for the humble other, and
this feeling manifests itself across conditions.
These results have some parallel in the Gergen
and Wishnov (1965) finding that subjects
would, out of commiseration, reveal more of
their negative features to a self-derogating
other regardless of whether they expected a
long-term relationship or not.

One might raise questions about the rela- 
tionship of this study, as well as others done 
in a bargaining context, to more general 
forms of human behavior. Is the two-person 
game, in other words, not too rarefied for the 
results to be translated into useful and cogent 
principles of social interaction? For those 
who believe bargaining behavior to be suffi- 
ciently intriguing in and of itself, this ques- 
tion may be of little moment. However, one 
of the major intents of the present study was
to highlight the relationship between pure
experimental games and more broadly per-
vasive social factors. In one sense, the ex-
perimental game thus becomes a useful ve-
hicle for testing out ideas concerning social
interaction. Regarding the present experi-
ment, it is not difficult to think of social
situations which have conceptual similarities.
One might be led to predict, for example, the
reactions of peers in a business organization
to each other. In such organizations one is
often faced with the choice of competitive
exploitation versus cooperation for mutual
benefit. The present experiment would sug-
gest at least two important factors, namely,
perceived personality and amount of antici-
pated interaction, which would determine
which of these choices would be made.


The purpose of the study was to compare the decision-making performances of
established and ad hoc groups under conditions of high and low substantive
conflict. The results indicated that established groups were significantly superior
to ad hoc groups in decision performance relative to several criteria. The group
processes for handling conflict, as revealed in an analysis of emergent solution
products, also seemed to differ in that ad hoc groups were likely to resolve
differences through compromise procedures whereas established groups re-
sponded with increased creativity. Moreover, the data indicated that ad hoc
groups were systematically limited by the quality of their prediscussion member
resources while established groups were not so limited. The importance of the
group tradition variable in the search for principles of group functioning is
stressed.


A few years ago Lorge, Fox, Davitz, and 
Brenner (1958) called attention to an impor- 
tant methodological issue which emerged from 
their review of group decision-making re- 
search. They found that the preponderance 
of investigations had been conducted with 
ad hoc assemblies rather than with actual es- 
tablished groups and that researchers were 
following the potentially misleading practice 
of generalizing principles valid for these 
aggregates of strangers to well-established 
groups whose members had assayed one an- 
other's resources over long periods of time. 
Halpin (1951-52) discussed the same point 
in an earlier review of small group research 
and warned of the risks involved in extrap- 
olating inferences from differing group types 
to one another. The point would seem to be 
well taken, for those who have claimed the 
small group as their research purview are 
particularly sensitive to the significance of 
group dynamics as they stem from and ulti- 
mately affect prolonged interaction among 
group members. The fact remains, however,
that very few investigators have concerned 
themselves with a direct test of the validity 
of cross-generalization practices. 

Lorge et al. (1958) have put the issue suc- 
cinctly: 

The ad hoc group is treated as a microscopic model 
of the traditioned group. This might be true, but has 
not been experimentally validated [p. 338]. 

This statement, then, is taken as the focus for 
the present study. The purpose of this investi-
gation will be one of comparing the decision-
making performances—in terms of several 
outcome and process variables (Brim, Glass, 
Lavin, & Goodman, 1962)—of established 
groups and experimentally created ad hoc 
assemblies composed of functioning execu- 
tives from business and industry. While some 
implicit notions concerning the existence of 
functional differences between the two group 
types are entertained, the investigation will 
proceed along exploratory lines in order to 
achieve a conservative and objective assess- 
ment of the performances of established and 
ad hoc groups. 


EXPERIMENTAL DESIGN 


Setting and Subjects 


The study was conducted in connection with sev- 
eral management training programs and a number 
of short demonstration programs of a similar nature. 
In all, 285 individuals from the ranks of manage- 
ment were utilized in the study and these, in turn, 
provided the membership for some 40 groups ranging 
in size from six to nine members each. Of the 40 
groups studied, 20 were composed of individuals who 
had shared in excess of 50 hours of ingroup activity 
(from a few weeks to several years) and, thus, con- 
stituted established groups. The remaining 20 groups 
served as the ad hoc assemblies for the study since 
their members had had no experience of ingroup ac-
tivity prior to the experiment and knew little, if
anything, about one another. While it was impossible 
to control for group size under the training condi- 
tions which prevailed, the established and ad hoc 
conditions were comparable in that the mean group 
size for established groups was 7.2 members while 
the mean size for ad hoc groups was 7.1 members. 
COMPARISON OF ESTABLISHED AND AD HOC GROUPS


The 144 subjects comprising the 20 established
groups and the 141 subjects who served as members 
of the ad hoc groups represented the full range of 
management responsibility and the groups within 
both conditions were equated for management status. 


Task and Procedure 


The task employed was the 12 Angry Men group
decision-making exercise.¹ As a task involving both
preinteraction individual judgments and an interac-
tion-based group decision, the 12 Angry Men exer-
cise involves a number of interdependent judgments
concerning the behavior of the 12 characters in the
film, 12 Angry Men² (cf. Hall, Mouton, & Blake,
1963). Based on a script by Reginald Rose, the film
concerns the attempts of 12 jurors to reach a unani-
mous verdict regarding the guilt or innocence of a
defendant accused of murder. On a preliminary bal-
lot, 11 of the jurors wish to return a verdict of
“guilty,” while the remaining juror votes “not guilty.”
Further deliberation becomes necessary, during which
each of the jurors reveals something of himself in
terms of personal prejudices, background, and cog-
nitive style. After 38 minutes of film time have
elapsed, a second secret ballot is taken on which one
of the jurors originally voting “guilty” changes his
vote.

The film was halted at this point and the subjects 
were informed that eventually all the jurors initially 
voting “guilty” would one by one change their votes
to “not guilty.” As a test of their abilities to assess
motivations and predict behavior, the subjects were
asked to predict, first as individuals and then as
members of a group, the order in which the 11
jurors would shift to the “not guilty” position. No
subjects who had previously seen the film were used
in the study.


Criteria of Measurement 


The task affords a number of estimates of group 
functioning. Not only can the adequacy of group 
decisions be evaluated, but indices of creativity, uti- 
lization of member resources, and conflict effects can 
be computed as well. 

Decision adequacy index. Since subjects rank or- 
der the capitulation of jurors, it is possible to com- 
pare individual and group prediction orders with 
the actual order of shifting as it subsequently occurs 
in the film, and the adequacy of each individual’s 
or group’s decision can be assessed in terms of its 
summed deviations from the actual order of capitu- 
lation. This summed absolute deviation represents 
an error score and is inversely related to the ac- 
curacy of prediction orders and, hence, to decision 
adequacy. Thus, the task affords a strictly numerical
assessment of decision adequacy, both at the indi-
vidual and group levels, which can vary from 0 to
60 points away from absolute accuracy.


¹ This exercise was designed as a laboratory teach-
ing experience by Robert R. Blake and Jane S.
Mouton. The authors are indebted to them for this
contribution to laboratory training.

² Fonda-Rose Production, released through United
Artists Studios, New York, April 1957.
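
The decision adequacy index described above amounts to a sum of absolute rank differences. A minimal Python sketch follows; the function names and the juror labels are hypothetical, and the orders used are invented for illustration rather than taken from the film.

```python
def error_score(predicted_order, actual_order):
    """Sum of absolute deviations between predicted and actual positions.

    Both arguments list the same juror labels, earliest capitulation first.
    """
    actual_position = {juror: pos
                       for pos, juror in enumerate(actual_order, start=1)}
    return sum(abs(pos - actual_position[juror])
               for pos, juror in enumerate(predicted_order, start=1))


def max_error_score(n_items):
    """Largest possible score, reached when the order is fully reversed."""
    return n_items * n_items // 2


actual = ["J%d" % k for k in range(1, 12)]     # 11 hypothetical juror labels
perfect = list(actual)
reversed_order = list(reversed(actual))

print(error_score(perfect, actual))            # 0
print(error_score(reversed_order, actual))     # 60
print(max_error_score(11))                     # 60
```

For 11 jurors a completely reversed prediction yields 10 + 8 + 6 + 4 + 2 + 0 + 2 + 4 + 6 + 8 + 10 = 60, which agrees with the stated range of 0 to 60 points.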

Utilization of resources index. The averaged or 
pooled individual product frequently serves as the 
base line from which group decisions are evaluated 
since most purely statistical benefits resulting from 
the cancellation of individual errors, if they are to 
obtain in group decision making, will be realized 
and controlled for at the moment of pooling (Gurnee, 
1937a, 1937b). Gain or loss in adequacy of the final 
group product over this average individual decision, 
represented by a difference score, may reflect addi- 
tional benefits accruing from interaction and, there- 
fore, serves as an index of the extent to which groups 
have been able to utilize the superior contributions 
of their members. 
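
The utilization of resources index is a simple difference score and can be sketched as below. The sign convention (positive values meaning that the group decision is more accurate than the average member) and the function names are assumptions made for illustration, and the member scores are invented.

```python
def average_individual_error(individual_errors):
    """Pooled (averaged) prediscussion error score for one group."""
    return sum(individual_errors) / len(individual_errors)


def gain_or_loss(individual_errors, group_error):
    """Average individual error minus group error.

    Positive values indicate a gain in adequacy over the base line;
    negative values indicate a loss.
    """
    return average_individual_error(individual_errors) - group_error


# Invented error scores for a six-member group and its group decision.
members = [20, 26, 18, 30, 22, 22]
print(average_individual_error(members))        # 23.0
print(gain_or_loss(members, group_error=15))    # 8.0
```

Because errors are lower for better predictions, subtracting the group score from the base line makes improvement appear as a positive gain.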

Conflict index. Guetzkow and Gyr (1954) have 
suggested that group conflict is composed of two 
conflict components: substantive and affective. The 
focus in this study is on substantive conflict, under 
the assumption that differences of opinion are the 
major precursors to affective conflict in decision- 
making groups. Rather than relying on observers’ 
reports, as has typically been the case in past stud- 
ies, the task itself is used as the basis for assessing 
the presence of conflict potential. In addition to the 
quantification of decision adequacy which the task 
affords, the prediscussion data produced by each 
individual member—when combined with those of 
other group members—allow a sensitive estimate of 
the degree to which members are in agreement and, 
hence, of potential substantive conflict. Through the 
use of the coefficient of concordance (Walker & Lev, 
1953) a value of .00 (minimal) to 1.00 (maximal) 
can be obtained which reflects the amount of agree- 
ment existing among group members prior to group 
discussion. These coefficients can then be employed 
as a general index of the degree of substantive con- 
flict represented in the groups and as raw data in 
assessing the effects of this conflict on group process. 
Thus, low order coefficients denote high conflict and 
high order values denote low conflict conditions. In 
the present study, coefficient values ranged from .28 
to .69 for established groups and from .24 to .83 for 
ad hoc groups. Groups within each condition were 
separated at the median to create high and low con- 
flict categories. 
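
The conflict index can be illustrated with the usual formula for the coefficient of concordance without ties, W = 12S / [m²(n³ − n)], where m is the number of members, n the number of ranked jurors, and S the sum of squared deviations of the rank totals from their mean. The sketch below is a hedged illustration: the function names are hypothetical, ties are assumed absent, and the handling of groups falling exactly at the median is an assumption, since the text does not describe it.

```python
import statistics


def coefficient_of_concordance(rankings):
    """Kendall's W for m complete rankings of the same n items (no ties).

    `rankings` is a list of m lists; rankings[i][j] is the rank that
    member i assigned to item j.
    """
    m = len(rankings)
    n = len(rankings[0])
    rank_totals = [sum(member[j] for member in rankings) for j in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((total - mean_total) ** 2 for total in rank_totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def median_split(w_by_group):
    """Label groups below the median W as high conflict, the rest as low."""
    median_w = statistics.median(w_by_group.values())
    return {group: ("high conflict" if w < median_w else "low conflict")
            for group, w in w_by_group.items()}


print(coefficient_of_concordance([[1, 2, 3, 4]] * 3))        # 1.0
print(coefficient_of_concordance([[1, 2, 3], [3, 2, 1]]))    # 0.0
```

Complete agreement gives W = 1.00 and complete disagreement between two members gives W = .00, so lower coefficients correspond to greater potential for substantive conflict.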

Creativity-compromise index. Closely associated 
with the issue of substantive conflict are creativity 
and compromise potentials, for each is a likely con- 
sequence of opinion differences. Some estimate of 
creativity-compromise potentials can be made on 
the basis of the frequency with which groups em- 
ploy emergent solutions in their decisions; that is, 
the frequency with which solutions not possessed by 
any member prior to discussion are created by the 
group as a whole and used in the final group de- 
cision in lieu of preexistent individual solutions. 
Since the total decision in the present study is com- 
posed of 11 interdependent judgments, a group could 
conceivably use from 0 to 11 emergent solutions in 
fashioning its decision. Similarly, the quality of these 
emergent solutions, relative to the average group
resource for a given judgment or in terms of abso- 
lute accuracy, can be coordinated to group tradi- 
tion as an index of creativity or compromise in the 
handling of conflict by established and ad hoc groups. 
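
The counting of emergent solutions can be sketched as follows. The code is illustrative and rests on an assumed operational reading of the definition above: a group judgment is treated as emergent when the rank the group assigns to a juror was assigned to that juror by no member before discussion. The function name and the data are hypothetical.

```python
def emergent_judgments(member_rankings, group_ranking):
    """Indices of judgments where the group rank matches no member's rank.

    member_rankings[i][j] is the prediscussion rank member i gave juror j;
    group_ranking[j] is the rank in the final group decision.
    """
    return [j for j, group_rank in enumerate(group_ranking)
            if all(member[j] != group_rank for member in member_rankings)]


# Invented data: two members ranking three jurors.
members = [[1, 2, 3],
           [2, 1, 3]]
group = [3, 1, 2]

emergent = emergent_judgments(members, group)
print(emergent)        # [0, 2]
print(len(emergent))   # 2
```

Applied to the full task, the length of the returned list would range from 0 to 11, one possible emergent solution for each of the 11 interdependent judgments.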


RESULTS AND DISCUSSION 


The data were arranged in a 2 X 2 fac- 
torial design with group tradition and level 
of conflict as the two factors. The means, 
standard deviations, and F ratios for the 
various performance criteria are presented in 
Table 1. Preliminary analyses indicated that 
the established and ad hoc groups were com- 
parable in ability (as reflected by their aver- 
age individual error scores of 22.5 and 23.7, 
respectively) and with regard to the degree 
to which there was a potential for either high
or low conflict among members. Mean coeffi-
cients of concordance for high (W = .46) and
low (W = .64) conflict were identical for the
two group types. Analyses of variance of these
two measures resulted in F ratios of less than
1.00 in each case. Thus, any differences which
might be found to exist between the two types 
of groups are not attributable to differences 
in available resources or to differential agree- 
ment among members. 


Group Decision Quality 


Many people feel that the ultimate prac- 
tical test of a group’s performance lies in the 
quality of its final decision product, particu-
larly in view of the number of man-hours in- 
vested in the group effort. As such, decision 
quality has received the lion's share of atten- 
tion in the group versus individual perform- 
ance controversy. Reference to the decision 
quality scores in Table 1 indicates that dif- 
ferences were found to exist between estab- 
lished and ad hoc groups which may have in- 
teresting implications for research addressing 
this issue. 

The two group types did not differ in the 
amount of time spent in working on their de- 
cision products, nor was time found to be re- 
lated to the quality of individual group prod- 
ucts, summed across group types. They did 
differ significantly, however, in terms of the 
qualities of the decisions they reached. Estab- 
lished groups produced a mean error score of 
13.15, while ad hoc groups produced decisions 
which averaged 16.60 points away from abso- 
lute accuracy. This difference in quality of 
3.45 points yielded an F of 4.12 (df = 1/36) 
which is significant at the .05 level. Thus, 
on the basis of their performance on the 12 
Angry Men task, it would seem that estab- 
lished groups enjoy superior performances 
relative to ad hoc assemblies in terms of de- 
cision outcome. This finding alone, however, 
tells very little about the manner in which 
the two group types arrived at their decisions 


TABLE 1 
MEANS, STANDARD DEVIATIONS, AND F RATIOS FOR DECISION-MAKING PERFORMANCE CRITERIA 

Condition | Coefficient of concordance | Average individual resource | Frequency of emergent solutions | Emergent solutions versus average resource | Emergent solutions absolute accuracy | Gain or loss | Group decision quality 
Established (A1) 
  M  | [illegible] | 22.50 | 1.40 | .45 | 2.50 | 9.39 | 13.15 
  SD | [illegible] | 3.40 | 1.19 | [illegible] | [illegible] | [illegible] | [illegible] 
 High conflict (B1) 
  M  | .46 | 24.52 | 1.80 | 1.30 | 2.50 | 11.92 | 12.60 
  SD | [illegible] | 3.55 | 1.48 | [illegible] | [illegible] | [illegible] | [illegible] 
 Low conflict (B2) 
  M  | [illegible] | 20.57 | 1.00 | -.40 | 2.50 | 6.87 | 13.70 
  SD | [illegible] | 1.81 | .67 | [illegible] | [illegible] | [illegible] | [illegible] 
Ad hoc (A2) 
  M  | [illegible] | 23.70 | 2.40 | -.65 | 6.00 | 7.08 | 16.60 
  SD | [illegible] | 3.87 | 2.04 | 3.12 | [illegible] | [illegible] | [illegible] 
 High conflict (B1) 
  M  | [illegible] | 25.18 | 3.20 | -1.50 | 7.40 | 6.58 | 18.60 
  SD | .09 | 2.76 | 1.48 | 3.24 | 4.58 | [illegible] | 7.35 
 Low conflict (B2) 
  M  | [illegible] | [illegible] | 1.60 | .20 | 4.60 | 7.58 | 14.60 
  SD | [illegible] | 4.41 | 2.27 | 6.16 | 8.11 | [illegible] | 5.50 
F ratios 
  A1 versus A2 | ns | ns | 4.02 | ns | 4.97* | ns | 4.12* 
  B1 versus B2 | 40.77** | 10.93** | 5.79* | ns | ns | ns | ns 
  A × B | ns | ns | ns | 17.25** | 18.08** | 4.56* | [illegible] 

* p < .05. 
** p < .005. 
and what proportion, if any, of the outcome 
variance may be attributed to the groups’ interaction 
processes. As Brim et al. (1962) 
have suggested, in inferring that the quality 
of a final decision is an index of the effective- 
ness of the group process one is hard pressed 
to specify what outcomes reflect either good 
or bad processes. For example, an assessment 
of the manner in which groups utilize available 
resources may reveal more about the 
processes employed than the relative quali- 
ties of decisions do. 


Utilization of Group Resources 


The adequacy of the average individual de- 
cision for each group, while not an index of 
group performance per se, represents a measure 
of central tendency in the group with respect 
to member decision adequacy prior to 
discussion. As such, it is commonly used as 
a base line for determining available resources. 
Improvements over this base line on the part 
of the final group decision would seem to re- 
flect the manner in which a group is able to 
utilize its available resources during group in- 
teraction, above and beyond any purely sta- 
tistical effects achieved by the cancellation of 
individual error. Thus, the amount of gain or 
loss in adequacy represented by the final decision 
as compared with the average individual 
error score may serve as one indicant 
of the effectiveness of a group's face-to-face 
interaction in identifying and adopting the 
better ideas present among its members. 

The use of gain or loss measures, as reflected 
in a difference score corrected for negative 
values, has also typically been found in 
studies dealing with the question of whether 
or not groups are superior to individuals in 
decision-making performance. In the present 
study, groups improved significantly over their 
average individual error scores (F = 27.82, 
df = 1/72, p < .005); but the issue of primary 
interest is to what extent established 
and ad hoc groups differ in their capacities for 
improvement. 

Comparisons of the mean gains in adequacy 
experienced by the two group types indicated 
that, while established groups gained more 
than ad hoc groups (9.40 and 7.08 points, 
respectively), the groups were not signifi- 
cantly different in this respect (F = 2.67). 
As the summary of F ratios in Table 1 shows, 
however, the Tradition × Conflict interaction 
was significant (F = 4.56, p < .05), indicat- 
ing that established groups tended to improve 
more when there was conflict among mem- 
bers than when there was fairly close agree- 
ment, while ad hoc group gains did not differ 
significantly as a function of level of con- 
flict. Under conditions of high conflict, estab- 
lished groups improved an average of 11.92 
points over their average individual error 
scores while under low conflict conditions 
they experienced an average gain of 6.87 
points. Ad hoc groups, on the other hand, 
averaged 6.58 points improvement under high 
conflict and 7.58 points under low conflict 
conditions. A comparison of these mean gains 
is presented graphically in Figure 1. This 
significant interaction effect suggests that the 
two group types may differ in their methods 
of conflict resolution. Support for this infer- 
ence may be found in the assessment of con- 
flict effects presented in the following section. 
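
The pattern behind this interaction can be checked with simple arithmetic on the four cell means quoted above. The following is a minimal sketch only: Python is used for illustration, and the dictionary layout and the helper names `conflict_effect` and `interaction_contrast` are assumptions introduced here, not part of the original analysis. It computes the effect of conflict within each group type and the difference between those two effects, which is the contrast that the Tradition × Conflict F ratio evaluates.

```python
# Mean gain (points of improvement over the average individual error score)
# for each Tradition x Conflict cell, as quoted in the text.
mean_gain = {
    ("established", "high"): 11.92,
    ("established", "low"): 6.87,
    ("ad hoc", "high"): 6.58,
    ("ad hoc", "low"): 7.58,
}


def conflict_effect(tradition):
    """High-conflict minus low-conflict mean gain for one group type."""
    return mean_gain[(tradition, "high")] - mean_gain[(tradition, "low")]


def interaction_contrast():
    """Difference between the two simple effects of conflict."""
    return conflict_effect("established") - conflict_effect("ad hoc")


established_effect = round(conflict_effect("established"), 2)
ad_hoc_effect = round(conflict_effect("ad hoc"), 2)
contrast = round(interaction_contrast(), 2)
print(established_effect, ad_hoc_effect, contrast)
```

On these means, conflict is associated with about 5 additional points of gain in established groups and about 1 point less gain in ad hoc groups, a contrast of roughly 6 points. Whether a contrast of that size is reliable is what the reported F of 4.56 addresses; the sketch does not reproduce that test.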


Effects of Substantive Conflict 


As was indicated earlier, the amount of 
opinion homogeneity, as reflected in coeffi- 
cients of concordance for the various groups, 
was taken as an index of the degree to which 
a group's discussion would potentially be 
characterized by substantive conflict. It 
should be pointed out that no attempt was 
FIG. 1. Mean gain-loss scores for established and 
ad hoc groups as a function of level of substantive 
conflict. 
made to verify inferences of high or low conflict 
via the observer report method, since 
groups worked under autonomous conditions. 
Rather, assessments of conflict potential were 
made directly from the data. Doubtlessly 
there were instances in which potential con- 
flict never became manifest and, conversely, 
there must have been periods of open conflict 
which were not suggested by the group data. 
Despite this methodological flaw, some inter- 
esting relationships involving conflict effects 
were found to exist. For example, it will be 
recalled that established groups were found 
to improve more over their average error 
scores when there was conflict than when 
there was none, while ad hoc groups displayed 
little differential performance, in terms of 
gain-loss criteria, under different conflict con- 
ditions. Some insight into this difference may 
be gleaned from the manner in which the 
groups employed emergent solutions in reach- 
ing their decisions, as opposed to relying on 
existing resources. 

Emergent products have been interpreted in 
this investigation as reflecting group reactions 
to conflict or, more specifically, as a means of 
resolving conflicts based on opinion dead- 
locks. A comparison of the mean frequencies 
for high and low conflict groups (2.5 and 1.3 
solutions per group, respectively, as shown in 
Table 1) yielded an F of 5.79 (df = 1/36, 
p < .05), thereby lending some support to 
this notion. At the same time, there was a 
tendency for ad hoc groups to employ emer- 
gent solutions with greater frequency than 
established groups. On the basis of 11 pos- 
sible emergent products per decision, the data 
reveal that on the average 23% of the items 
in the final decisions of ad hoc groups were 
emergent solutions as compared to only 12% 
of the items in the final decisions of estab- 
lished groups. Ad hoc groups produced an 
average of 2.4 such solutions per group, as 
compared with the mean per group output of 
1.4 for established groups. The difference be- 
tween these means produced an F of 4.02, 
which falls just short of the value needed for 
significance at the .05 level. Since the two 
group types were comparable in the degree 
to which they held the potential for substantive 
conflict, the tendencies revealed by the 
data may be symptomatic of a more basic 
predisposition among members of ad hoc 
groups; namely, to avoid conflict before it 
arises by producing neutral emergent products 
which are devoid of member vested interest. 
This predisposition, if indeed it does exist, 
should be revealed in the quality of emergent 
solutions produced and this, in turn, would 
seem to be a function of the groups’ capacity 
for either creativity or compromise. 


Creativity and Compromise Reactions to 
Conflict 

It was suggested earlier that two equally 
likely consequences of substantive conflict 
are creativity and compromise. Each of these 
terms is used to imply something about the 
quality of the emergent solutions produced 
by groups. Specifically, it is suggested that 
evidence of group creativity exists when emergent 
products are more adequate than the 
average individual resource, while emergent 
products resulting in judgments less accurate 
than those of the average resource are interpreted 
as being symptomatic of compromise. 
Should the two group types differ in terms of 
these reactions to conflict, many of the differences 
mentioned previously may become 
more meaningful. 

Of the 20 groups in each group tradition, 
16 established groups and 14 ad hoc groups 
produced emergent solutions, with the preponderance 
of the ad hoc groups falling in 
the high conflict condition. On a one-by-one 
basis, the quality of these emergent solutions 
did not differ significantly. Established groups 
tended to produce solutions which were superior 
to their average resources (.48 point) 
and ad hoc groups employed solutions less 
accurate (-.84), but the mean difference of 
1.32 points yielded a t of less than 1.00. It 
will be recalled, however, that ad hoc groups 
produced almost twice as many emergent 
solutions as established groups did (albeit 
with fewer groups) and this would suggest 
that the cumulative effect of a group's emergent 
solutions might be a more realistic basis 
for assessing a group’s capacity for either 
creativity or compromise in fashioning a total 
group decision. 

Therefore, the quality of emergent solutions 
was assessed in two ways. The first 
dealt with a difference score which reflected 
the degree to which the combined emergent 
solutions for each group were more or less 
accurate than the average individual judgments 
available for use. The second assessment 
dealt with the combined absolute deviations 
of emergent solutions from the correct 
judgment on the task. This, of course, 
amounted to an error score for each group 
which was based only on emergent products. 
Since not all groups in either condition produced 
emergent decisions, the data were analyzed 
according to Snedecor's (1962) analysis 
of variance method of unequal entries per 
cell, with mean squares corrected for disproportionality. 


No main effects were found to be signifi- 
cant in the first analysis. A significant inter- 
action between group tradition and degree of 
conflict was obtained, however, which further 
explicates performance differences between es- 
tablished and ad hoc groups. As reference to 
Figure 2 will indicate, established groups un- 
der high conflict conditions produced com- 
bined emergent products which were on the 
average 1.3 points superior to their average 
individual judgments, while ad hoc groups 
under the same condition produced emergent 
solutions which were an average of 1.5 points 
inferior to their average resources. Con- 
versely, established groups low in conflict 
FIG. 2. The effects of conflict on the quality of emergent 
solutions relative to average resources. 
incorporated emergent products which were 
an average of .4 point less adequate than the 
average individual judgment and ad hoc 
groups produced solutions which surpassed 
their average resources by a mean of .2 point. 
These differences resulted in a significant 
Tradition × Conflict F of 17.25 (df = 1/26, 
p < .005), suggesting that established groups 
react to substantive conflict with increased 
creativity while ad hoc groups resolve their 
conflicts of opinion via a compromising proc- 
ess. The analysis of emergent solution error 
scores serves to underscore this interpretation. 
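
The interpretive rule used in this section (emergent solutions more accurate than the average individual resource are read as creativity; less accurate ones as compromise) can be written as a small function. This is an illustrative sketch only: the function name, the dictionary layout, and the handling of an exactly zero difference are assumptions, and the inputs are the four cell means quoted above.

```python
def classify_reaction(emergent_minus_average):
    """Label combined emergent solutions relative to the average resource.

    Positive values mean the emergent solutions were more accurate than the
    average individual judgment; negative values mean they were less accurate.
    """
    if emergent_minus_average > 0:
        return "creativity"
    if emergent_minus_average < 0:
        return "compromise"
    return "neither"  # assumption: the source does not discuss an exact tie


# Cell means (points relative to the average individual judgment) from the text.
cell_means = {
    ("established", "high"): 1.3,
    ("ad hoc", "high"): -1.5,
    ("established", "low"): -0.4,
    ("ad hoc", "low"): 0.2,
}

labels = {cell: classify_reaction(value) for cell, value in cell_means.items()}
for cell, label in labels.items():
    print(cell, label)
```

Applied to cell means, the rule only labels the direction of each difference; it says nothing about whether a small difference such as .2 point is reliable, which is the job of the analysis of variance reported above.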

The analysis of error scores based on the 
combined emergent solutions for each group 
revealed that established groups not only pro- 
duce emergent solutions which are cumula- 
tively more accurate than those produced by 
ad hoc groups (an average of 2.5 points and 
6.0 points away from absolute accuracy, re- 
spectively), but they do so without regard to 
degree of conflict. While established group 
emergent solutions deviated from absolute ac- 
curacy an average of 2.5 points under both 
high and low conflict conditions, ad hoc prod- 
ucts reflected a mean deviation of 7.4 points 
under high conflict conditions and an average 
of 4.6 points under low conflict. This inter- 
action is depicted in Figure 3. The differences 
between these weighted means produced a 
main effect F of 4.97 (df = 1/26, p < .05) 
and an interaction-based F of 18.08 (p< 
.005). Thus both tradition and conflict effects 
are evident in the evaluation of emergent 
solutions and these, in turn, would seem to 
denote differential capacities for creativity 
and/or compromise on the part of established 
and ad hoc groups, particularly with reference 
to high conflict conditions. 


Relationships among Performance Criteria 
and Their Implications 


To the extent that group performance vari- 
ables covary and to the extent that estab- 
lished and ad hoc groups differ with reference 
to the interrelationships of these variables, it 
would seem that the researcher’s task is made 
more difficult. Such a state of affairs will re- 
quire a more diligent search for the individual 
principles to which established and ad hoc 
groups adhere, and a more precise specifica- 
tion by investigators of experimental condi- 
FIG. 3. Mean emergent solution error scores for 
established and ad hoc groups as a function of sub- 
stantive conflict. 


tions and the implications of their findings. 
In the present study, several significant dif- 
ferences between group types were obtained 
with regard to covarying performance criteria. 

Relationships among performance criteria. 
In addition to the analyses of variance re- 
ported earlier, the data were also subjected 
to correlational analyses in an attempt to pull 
out significant relationships. The coefficients 
obtained for the two group types were then 
tested by the z' test of significant differences. 


A summary of the intercorrelations obtained 
is presented in Table 2. 
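
For readers who wish to check the comparisons of correlations, the sketch below assumes that the z' test is the standard Fisher r-to-z' comparison of two independent correlations, with a standard error of sqrt(1/(n1 - 3) + 1/(n2 - 3)); the source does not spell out the formula. The function name is hypothetical, and n = 20 groups per tradition is taken from the description of the design. The example values are the Table 2 correlations between decision quality and frequency of emergent solutions (.440 for established groups, -.324 for ad hoc groups).

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    z1 = math.atanh(r1)  # Fisher r-to-z' transformation
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / standard_error


# Established (r = .440) versus ad hoc (r = -.324), 20 groups of each type.
z = fisher_z_difference(0.440, 20, -0.324, 20)
print(round(z, 2))
```

Under these assumptions the statistic comes out at about 2.36, in line with the value reported for this comparison in the text.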

Reference to Table 2 suggests some inter- 
esting interplays among performance vari- 
ables. Decision quality, for example, may be 
seen to be negatively related to conflict in 
ad hoc groups (r = — .55, p < .02), but vir- 
tually uncorrelated with this process variable 
in established groups (r= .03). While the 
difference between these two coefficients is 
not significant for a two-tailed test (z= 
1.83), it points to a potentially important im- 
plication of conflict for decision quality in the 
two types of groups. Similarly, a significant 
difference (z = 2.36, p < .02) between the 
correlation for decision quality and the fre- 
quency with which emergent solutions were 
employed by established and ad hoc groups 
was obtained. Among ad hoc groups the cor- 
relation was negligible (r= — .32), but 
among established groups the relationship 
was both positive and significant (r = .44, 
p < .05), thus reinforcing the earlier inference 
that established groups make better use 
of their emergent solutions than ad hoc groups 
do. Finally, important differences were found 
between established and ad hoc groups with 
reference to the relationship of group decision 
quality to average individual decision quality. 
Farnsworth and Williams (1936) have sug- 
gested that there is little reason to expect the 
group product to be any better than that of 
the individuals comprising the group. It might 
be expected, therefore, that a strong positive 


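TABLE 2 
SUMMARY OF CORRELATIONS AMONG DECISION-MAKING CRITERIA FOR ESTABLISHED AND AD HOC GROUPS 

Variable | Group type | Group decision quality | Gain | Frequency of emergent solutions | Average individual resource 
Substantive conflict | Ad hoc | -.553* | -.182 | .393 | -.711** 
 | Established | .031 | .536* | .137 | -.770** 
Group decision quality | Ad hoc | | [illegible] | -.324 | .770** 
 | Established | | .469* | .440* | .018 
Gain | Ad hoc | | | -.204 | .262 
 | Established | | | .413 | -.653** 
Frequency of emergent solutions | Ad hoc | | | | -.311 
 | Established | | | | [illegible] 

Note. Correlation coefficients for established groups appear below those for ad hoc groups. 
* p < .05. 
** p < .01. 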
relationship would be evidenced between a 
measure of group resources (the average in- 
dividual error score) and the quality of the 
final group decision. The data support such 
an expectation in the case of ad hoc groups 
(r= .77, p < .01), but fail to do so in the 
case of established groups (r= .02). This 
difference (z = 2.94, p < .003) suggests that 
ad hoc groups may be systematically limited 
in the production of group decisions to the 
quality of their prediscussion member resources 
while established groups experience 
no such restrictions. Thus, even when initial 
decision ability is controlled for, generaliza- 
tions resulting from research employing one 
or the other type of group are in need of 
qualification. 
It will be noted in Table 2 that the use of 
resources (as revealed in gain-loss scores) 
also varied with other variables. Gain and 
conflict were positively related for established 
groups (r = .54, p < .02), while there was 
no systematic relationship for ad hoc groups 
(r = -.18). As would be expected on the 
basis of the significant Tradition × Conflict 
interaction for gain reported earlier, the z' test 
of this difference yielded a z of 2.31 (p < .02). 
Similarly, a less direct conflict effect approached 
significance. The role of emergent 
solutions in fashioning increments of gain differed 
somewhat for established and ad hoc 
groups (r's equaling .41 and -.20, respectively), 
yielding a z of 1.88 which falls just 
short of the value required for significance 
at the .05 level. Finally, the relationship between 
the quality of prediscussion resources 
and amount of improvement constituted a 
point of significant departure between established 
and ad hoc groups. This relationship 
would seem to be an important factor in 
group decision making since it gives some 
indication of what groups with varying qualities 
of member resources are likely to do 
with those resources and to what extent improvement 
is contingent upon resource quality. 
Comparisons of the obtained r's for the two 
group types show that a negligible relationship 
existed between the two variables in ad 
hoc groups (r = .26) while established groups, 
on the other hand, displayed a significant 
tendency to improve most when their available 
resources were inaccurate to begin with 
(r = -.65, p < .01). The difference between 
these correlations yielded a z of 3.05, significant 
beyond the .002 level. Thus, the data 
suggest some critical differences in the manner 
in which established and ad hoc groups utilized 
their intragroup resources in the present 
study. 

Implications of obtained results. At least on 
the basis of present results, it can be said that 
established and ad hoc groups differ in their 
approaches to decision making. If, as Lorge 
et al. (1958) have suggested, the two group 
types adhere to different principles, a search 
for such principles is in order. The present 
investigation would seem to suggest at least 
one. A review of the results obtained indi- 
cates that of the differences between estab- 
lished and ad hoc group performances reach- 
ing significance, roughly 60% involved either 
pure conflict effects or conflict in interaction 
with group tradition. Thus, in view of the 
interpretations of differences offered earlier, 
it appears that ad hoc groups respond to con- 
flict in a manner designed to bring about 
compromise and, thereby, to short-circuit dis- 
agreements. Established groups, on the other 
hand, seem to view conflict as symptomatic 
of unresolved issues and adopt procedures 
designed to bring about constructive resolu- 
tion of differences. Therefore, it might be in- 
ferred that established group creativity re- 
flects an ability to treat conflict objectively 
and as problem oriented, while ad hoc com- 
promise reflects a tendency to view conflict 
among strangers as having potential affective 
consequences which preempt the importance 
of the task. If this is so, there are implica- 
tions for not only decision quality, but for 
commitment issues as well. It might be ex- 
pected that group members who compromise 
their own positions in order to avoid conflict 
and thereby preserve their deindividuation 
will not be as committed to the final group 
product as those members who work through 
conflicts more creatively. This is an easily 
testable hypothesis and points toward future 
research with established versus ad hoc 
groups. 

At the same time, it will be recalled that 
established and ad hoc groups did not differ 
particularly with reference to their perform- 
ances under low conflict conditions and this 
fact may have implications for group performance 
based on initial agreement among members. 
The performance of both group types 
showed generally less creativity and/or com- 
promise than obtained under high conflict 
(this would be expected under the assumption 
that both are reactions to conflict) and some 
type of convergence process seemed to pre- 
vail. Kelley and Thibaut (1954) have sug- 
gested that individuals are likely to reexamine 
their opinions only when it becomes obvious 
that they differ from those of others. Thus, 
the lack of differences which characterizes a 
low conflict condition may be interpreted as 
the validation of one's opinions which serves 
as a signal that it is not necessary to look 
further for a more correct solution. It would 
appear that lack of conflict attenuates group 
performance somewhat in both established 
and ad hoc groups with the effect that the 
groups' fullest potential is not realized. 
Obviously, these generalizations are in need 
of qualification to the extent that executive 
groups are not representative of the general 
population of decision makers and in terms 
of the laboratory quality of the experimental 
setting and task employed. On the other hand, 
these results may have practical implications 
in that they may offer some insight into why 
some decision makers in practical settings 
have success with groups while others have 
failure experiences. The use of ad hoc com- 
mittees for problems requiring creativity 
would now seem to be a questionable practice. 
Similarly, they raise theoretical questions. 
For example, with respect to the issue of the 
effects of minority opinions: do they always 
upgrade task accomplishment, as Torrance 
(1957) has suggested, or can their expression 
lead the group in any direction so long as 
they are positively voiced, as suggested by 
Shaw and Penrod (1962)? It would appear 
that these and other effects are dependent 
upon the type of group studied and, therefore, 
are in need of reevaluation within specified 
group contexts if any general principles of 
group decision making are to be discovered. 


40 college Ss were assigned to a 2 × 2 × 2 design composed of Ss’ test anxiety 
level, Ss’ sex, and reinforcement. Predictions were made concerning the effects 
of reinforcement upon the frequency of negative and positive academic self-statements 
(NAS and PAS) in a free verbalization session. Predictions were 
also made concerning the effects of reinforcement on postexperimentally administered 
anxiety scales relative to preexperimental scale scores. Reinforcement 
significantly influenced NAS but did not affect PAS. Reinforcement for 
NAS had the general effect of reducing anxiety scores but had relatively little 
effect on scores from Ss reinforced for PAS. 


Studies of verbal conditioning are often regarded 
as laboratory parallels of the psychotherapeutic 
process. Yet, if such studies are to 
provide a paradigm for examining psychotherapy, 
it is significant that they have, with rare exceptions, 
omitted before and after measures of 
personality as indices of change. More commonly, 
verbal conditioning has been investigated 
as an end in itself, as the dependent variable, 
while numerous independent factors have been 
varied. 

The present investigation had two goals. The 
first was to determine the feasibility and efficacy 
of verbally reinforcing, differentially within a 
free verbalization period, negative and positive 
aspects of the same response class: academic 
self-statements. These statements were defined 
to include personal self-references discriminably 
positive or negative with respect to academic 
traits and experiences. The second and major 
goal was to determine the extent to which verbal 
reinforcement for the two response classes would 
affect postexperimental scores on two scales 
intended to measure anxiety. 


Predictions 


It is hypothesized in the present experiment 
that if a relatively high degree of a priori similarity 
exists between the contingencies for reinforcement 
and the pencil-and-paper measures 
of anxiety, generalization will occur. It is 
specifically hypothesized from an operant conditioning 
model that reinforcement for positive 


1 This paper is based on a thesis submitted in 
partial fulfillment of the requirements for the degree 
of Master of Science at the University of Washington. 
The writer is indebted to his sponsor, Irwin G. 
Sarason, and to the members of his committee, 
Donald M. Baer and Ezra Stotland. 
academic self-statements will result in lowered 
(more positive) anxiety scores while reinforce- 
ment for negative academic self-statements will 
result in heightened (more negative) anxiety 
scores. Similarly, it was predicted that reinforce- 
ment for positive statements will lead to an in- 
crement of that class, while reinforcement for 
negative statements will result in an increment of 
that class. With respect to the latter prediction 
the literature indicates that success has been 
achieved in reinforcing statements of a negative 
quality while positive statements are somewhat 
less amenable to the usual verbal reinforcements 
(Rogers, 1960; Sarason & Ganzer, 1962, 1963). 


METHOD 


Subjects 


Subjects were 20 men and 20 women from an 
introductory psychology class at the University of 
Washington. From two classes totaling 206 students 
the 40 subjects were selected on the basis of their 
scores on the Test Anxiety Scale (TAS; Sarason, 
1958), a 17-item scale which was embedded in the 
Autobiographical Survey (an inventory including, 
among others, the General Anxiety Scale). This sur- 
vey was administered early in the same quarter in 
which the study was carried out. Because TAS 
scores of males and females were distributed in 
similar fashion, subjects were selected within the 
same scale score cutoff points. The cutoff points for 
high and low TAS subjects were 9 to 16 and 1 to 3, 
respectively. In order to allow some variance in both 
directions on the retesting of these subjects, absolute 
bottom and ceiling scores were discarded from the 
populations which were sampled. The cutoff points 
represent, approximately, the upper and lower 30% 
of the total distribution. The correlation between 
the General Anxiety Scale (GAS) and TAS for the 
above population was r= .58. The experiment took 
place over the course of a 7-week period. The vari- 
ance in time delay between original testing and the 
combined verbal reinforcement session and postexperimental 
testing was evenly distributed throughout 
the sample. 

In the resulting 2 × 2 × 2 factorial design, high 
and low TAS scores formed one factor, sex of the 
subjects a second, and the random assignment of 
equal numbers of subjects within the above factors 
to conditions of reinforcement for negative or posi- 
tive academic self-statements constituted the third. 
The second anxiety measure, GAS, did not consti- 
tute a factor but was administered before and after 
the verbalization session like the TAS to determine 
the effects of reinforcement upon it. All subjects 
were asked to talk about their "academic person- 
ality" for a period of 24 minutes. During the last 20 
minutes of this talk 100% reinforcement was given 
for one class of response or the other. The reinforce- 
ment used was the examiner's verbal “um-hum.” 
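
The layout of this design can be sketched in a few lines. The sketch is hypothetical in its details: the subject IDs, the seed, the function name, and the split of 10 subjects per anxiety-by-sex combination (5 per cell) are assumptions that follow from 40 subjects spread equally over 8 cells; the source states only that equal numbers were randomly assigned to the two reinforcement conditions within the other factors.

```python
import itertools
import random

ANXIETY_LEVELS = ("high TAS", "low TAS")
SEXES = ("male", "female")
SUBJECTS_PER_BLOCK = 10  # assumption: 40 subjects / (2 anxiety levels x 2 sexes)


def build_design(seed=0):
    """Assign hypothetical subject IDs so that every cell has equal size."""
    rng = random.Random(seed)
    assignment = {}
    next_id = 1
    for anxiety, sex in itertools.product(ANXIETY_LEVELS, SEXES):
        block = list(range(next_id, next_id + SUBJECTS_PER_BLOCK))
        next_id += SUBJECTS_PER_BLOCK
        rng.shuffle(block)  # random assignment to a reinforcement condition
        half = len(block) // 2
        assignment[(anxiety, sex, "NAS")] = sorted(block[:half])
        assignment[(anxiety, sex, "PAS")] = sorted(block[half:])
    return assignment


design = build_design()
cell_sizes = {cell: len(ids) for cell, ids in design.items()}
print(cell_sizes)
```

Randomizing within each anxiety-by-sex block, rather than across all 40 subjects at once, is what keeps the cell sizes equal.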


Materials 


Equipment in the experimental room included a 
tape recorder and a signaling device manipulated 
by the examiner, two chairs, and a table desk. The 
tape recorder was located on the opposite side of the 
desk from the subject and out of his line of vision. 
The signaling device served to introduce a readily 
distinguishable tone on the recording and was used 
to mark the end of each 4-minute period and to 
indicate when the examiner verbally reinforced the 
subject. 


Instructions 


The subject was greeted in a friendly way and 
taken immediately to the room described above. 
The examiner and the subject sat facing each other 
across the desk. At this point the examiner sought 
to put the subject at ease by asking him some 
innocuous questions such as age, major, class, etc. 
Every subject was then given the following instruc- 
tions (in abridged but essential form): 


You've probably noticed that everyone talks 
about college students and what they're like, but 
few people try to find out anything from the 
students themselves. A lot of time is spent on 
opinion and attitude surveys to gather information 
of students’ ideas concerning current issues—but 
this is not what I’m interested in finding out 
about. I am interested in getting an idea of what 
students think and feel about themselves as stu- 
dents. I want to hear how the various aspects of 
your personality affect your classroom work—on 
exams, term papers, studying, etc. I would like 
to know also whether you feel that you are doing 
as well in school as you might and what sort of 
things about yourself help or hinder your progress. 
In a manner of speaking you will be looking at 
yourself carefully for this short time and evalu- 
ating your academic personality and the various 
things which you feel are relevant to it. I will 
refrain from asking you any questions and you 
will talk at your own pace—and remember— 
there's no need to be worried about long pauses 
or repetitions. Go ahead. 
At the close of the “interview” the second administration 
of the TAS and the GAS was completed 
after the following instructions were given: 


Okay, that's all there is to it. Oh, there is one 
more thing—I’ve been asked to give this question- 
naire to all the people I talk with here. If you 
would fill it out to the best of your ability I will 
leave you now. Just put it in that box over there 
when you're done and close the door when you 
leave. Thank you for helping me with this project 
and, please, don't discuss it with anyone. 


To recapitulate, all subjects were seen individ- 
ually and given a set of instructions intended to 
familiarize them with the interview task. At the 
close of the interview the subjects were asked to 
complete a questionnaire consisting of the Test 
Anxiety Scale and the General Anxiety Scale. The 
questionnaire was presented in such a way as to 
reduce the conscious association between it and its 
previous administration. To facilitate this end, the 
questionnaire was presented in a different print and 
format, the questions appeared in a different order 
than on the original form, and it was much shorter 
than the original, having two scales instead of five. 


Response Criteria 


Academic self-statements were defined on the 
basis of an adaptation of Rogers’ (1960) criteria. 
Four categories were used to rate subjects’ verbali- 
zations. Briefly, these were: 

1. Positive academic self-statements (PAS). These 
statements tend to place the person in a favorable 
light with respect to academic achievement, ability,
interest, or attitude. They were statements which the 
examiner considered socially desirable for the uni- 
versity undergraduate who places some emphasis on 
good grades, good study habits, a desire to learn, 
and has graduation as a goal (e.g. “I’m very happy 
at the U” or “I don't have to worry about grades"). 

2. Negative academic self-statements (NAS). These 
statements tend to place the person in an unfavorable 
light with respect to academic achievement, ability, 
interest, or attitude (e.g. “I lack confidence on
tests,” or “I’m doing a lousy job of studying here”).

3. Ambiguous academic self-statements (AAS). 
These statements contain some reference to the 
subject’s academic life or attitude but are not 
classifiable as either positive or negative (e.g. “I
guess I do all right in school,” or “You can't get
ahead without a college degree”). 

4. Other statements (OS). These statements may 
or may not pertain to self but they do not contain
a reference to the academic self (e.g. “I’m a miserable
person,” or “My parents are happily married”).


Reliability of Verbal Ratings 


Average percentages of agreement were computed
between the first ratings and the second ratings, which
were made several months after the first ratings, by
the examiner on a random sample of three subjects.
For any one subject there were four possible categories
in which statements might be judged to belong and six
time periods, making a total of 24 cells which would
account for all statements over time which a sub- 
ject could make. The percentage of agreement was 
found by averaging the 24 percentages resulting from 
a comparison of each cell (on the first rating) with 
its corresponding cell (on the second rating). The 
average percentages obtained with the three sub- 
jects were 78%, 73%, and 79%. The same three 
subjects were then rated by another judge after a 
brief acquaintance with the criteria. The average
percentages of interjudge agreement were 65%,
53%, and 68%.
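The averaging over 24 cells described above can be sketched as follows. This is only an illustration under stated assumptions: the article does not give the formula for the percentage in a single cell, so the sketch assumes the smaller count divided by the larger count, and the function names and the counts are hypothetical.

```python
from typing import Sequence

# Layout assumed here: 4 categories (PAS, NAS, AAS, OS) x 6 time periods.
Counts = Sequence[Sequence[int]]


def cell_agreement(first: int, second: int) -> float:
    """Percentage agreement for one cell.

    ASSUMPTION: smaller count divided by larger count. The article does
    not state the per-cell formula.
    """
    if first == 0 and second == 0:
        return 100.0  # both ratings agree that the cell is empty
    return 100.0 * min(first, second) / max(first, second)


def average_agreement(first_rating: Counts, second_rating: Counts) -> float:
    """Average the per-cell percentages over all cells (24 in the article)."""
    cells = [
        cell_agreement(a, b)
        for row_first, row_second in zip(first_rating, second_rating)
        for a, b in zip(row_first, row_second)
    ]
    return sum(cells) / len(cells)


# Made-up counts for one subject, for illustration only.
first = [
    [2, 3, 1, 2, 4, 2],   # PAS
    [1, 0, 2, 3, 2, 1],   # NAS
    [7, 8, 6, 7, 9, 8],   # AAS
    [2, 2, 1, 3, 2, 2],   # OS
]
second = [
    [2, 2, 1, 3, 4, 2],
    [1, 0, 1, 3, 2, 2],
    [6, 8, 7, 7, 8, 8],
    [2, 3, 1, 3, 2, 1],
]
print(f"average agreement: {average_agreement(first, second):.1f}%")
```

A per-cell index of this kind penalizes small cells heavily (1 versus 2 statements counts as 50% agreement), which may be one reason averaged figures in the 70s can coexist with broadly similar ratings.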


RESULTS
Verbal Data 


The results summarized in this section indicate 
the extent to which the experimental variables 
influenced the subjects’ verbal behavior over the 
course of the verbalization session. The content 
analysis of the protocol tapes provided a mass 
of data which was subsequently analyzed in 
many ways. For brevity, however, only the 
simple analyses (not those involving trend effects 
or transformations of the data) of summed raw
scores were selected to be presented here.

Negative academic self-statements (NAS).
Statements judged to fall within the NAS category
were summed over the last 20 minutes of
the 24-minute sessions for each subject. A simple 
analysis of variance was performed on the NAS 
totals summarized in Table 1. Bearing on the 
predictions made earlier is the significant effect 
for the reinforcement condition. The mean total
of NAS made by subjects reinforced for emitting
NAS was 14.45 as compared with a mean
of 7.85 NAS emitted by subjects reinforced
for PAS. The difference between the means is
significant at the .025 level.


TABLE 1


COMBINED SUMMARIES OF THE ANALYSES OF VARIANCE
OF NEGATIVE ACADEMIC SELF-STATEMENTS AND
ANXIETY SCALE CHANGE SCORES


                                    MS                    F
Source               df       NAS    D scores       NAS    D scores
Anxiety (A)           1       .02      72.90                 5.53*
Sex (S)               1       .32      14.40                 1.09
Reinforcement (R)     1     87.12      67.60      6.49**     5.13*
A × S                 1       .72      32.40                 2.46
A × R                 1      2.40      78.40                 5.95*
S × R                 1     19.22        .10      1.43
A × S × R             1       .66       2.50
Error                32     13.41      13.17
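As a reading aid, the F ratios in Table 1 can be checked against the mean squares, since each F is the effect mean square divided by the error mean square. The sketch below uses the D-score and NAS entries as they appear in the table; the function name is hypothetical, and small discrepancies in the second decimal are to be expected because the published mean squares are rounded.

```python
def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F ratio for a fixed-effects analysis of variance: MS effect / MS error."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


# Mean squares taken from Table 1.
MS_ERROR_NAS = 13.41
MS_ERROR_D = 13.17

checks = {
    "Reinforcement, NAS": (87.12, MS_ERROR_NAS),
    "Anxiety, D scores": (72.90, MS_ERROR_D),
    "Reinforcement, D scores": (67.60, MS_ERROR_D),
    "A x R, D scores": (78.40, MS_ERROR_D),
}

for label, (ms_effect, ms_error) in checks.items():
    print(f"{label}: F = {f_ratio(ms_effect, ms_error):.2f}")
```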
TABLE 2


MEAN NUMBERS OF STATEMENTS CLASSIFIED BY
CONTENT AND REINFORCEMENT CONDITION


Category     R-PAS     R-NAS
PAS          12.15     13.40
NAS           7.85     14.45
AAS          42.90     42.50
OS           11.50     10.85

Total        74.40     81.20


Positive academic self-statements (PAS). The 
reinforcement conditions did not act differen- 
tially upon total emission of PAS. Thus, the 
verbal reinforcer used by the examiner was in- 
effective in influencing the subjects’ tendency to 
make positive academic self-statements about 
themselves. The only F reaching significance was 
the third-order interaction which is too complex 
for useful interpretation. The summary of the 
analysis is not reproduced here. 

Analyses of the AAS and OS categories did 
not reveal significant differences or interactions 
and are not reproduced here. Table 2 shows the 
mean raw numbers of statements made during 
the last 20 minutes of the session under the two 
reinforcement conditions. 


Anxiety Score Changes 


Simple 2 × 2 × 2 analyses of variance were
performed on difference scores obtained from the
pre- and postexperimental administrations of the
TAS and the GAS. The difference (D) scores
were computed for each subject by subtracting
the preexperimental scores from the postexperimental
scores. Consequently, a negative D
score indicates a decrease in reported anxiety 
while a positive D score indicates an increase in 
reported anxiety. 

A factor analysis of personality item endorsements
(Schultz ²) suggests that TAS and GAS
items share common variance. One of the factors 
(clearly an anxiety factor) contained 16 TAS 
items and 9 GAS items with loadings greater 
than +.40. Proceeding on the tentative assump- 
tion that TAS and GAS are, within uncertain 
limits, measures of the same variable, an analy- 
sis was performed on a new set of D scores 
calculated from summed pre- and postexperi- 
mental TAS and GAS scores. 

The analysis, summarized in Table 1, yielded
significant results generally consistent with the
separate analyses of TAS and GAS D scores.
Due to the similarity between the separate TAS
and GAS D score analyses and the combined D
score analysis, summaries of the former have
been omitted. The relevant means for the separate
analyses, however, have been included with
the combined D score means in Table 3. Disregarding
the main effect for anxiety for the
reason stated earlier, it is interesting that reinforcement
emerged as a significant (p < .05)
effect. The R-PAS condition resulted in a mean
D score of −.25 while the R-NAS condition
resulted in a mean D score of −2.85. Thus, the
R-PAS condition led to a very slight reduction
in overall anxiety, whereas the R-NAS condition
led to a relatively great decrease. Of interest
also is the Anxiety × Reinforcement interaction
whose means are shown in Table 3. Duncan's
multiple range test (Edwards, 1960) showed that
the mean D score for high test anxious subjects
under the reinforcement for NAS treatment differs
significantly (p < .01) from all other means.
It is apparent that high test anxious subjects
reinforced for the NAS category radically lowered
their scores relative to subjects in other
treatment groups.


2 C. Schultz, personal communication, 1962.


TABLE 3


MEANS OF PRE- AND POSTEXPERIMENTAL ANXIETY
SCALE SCORES FOR THE A × R INTERACTION


                         R-PAS            R-NAS
                      Pre    Post      Pre    Post
High test anxious
  TAS                11.0    12.2     12.5     8.9
  GAS                 6.9     5.5     10.7     8.7
  Combined           17.9    17.7     23.2    17.6
Low test anxious
  TAS                 2.6     2.8      1.8     1.6
  GAS                 4.0     3.5      3.8     3.9
  Combined            6.6     6.3      5.6     5.5


Note. Mean D scores may be computed by subtracting the
prescore from the postscore.
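The D scores discussed above can be recomputed from the combined means in Table 3. The sketch below is illustrative only: the variable and function names are hypothetical, and averaging the high and low test anxious cells without weights assumes equal numbers of subjects per cell, which the text does not state explicitly.

```python
# Combined (TAS + GAS) means from Table 3, stored as (pre, post).
combined_means = {
    ("high", "R-PAS"): (17.9, 17.7),
    ("high", "R-NAS"): (23.2, 17.6),
    ("low", "R-PAS"): (6.6, 6.3),
    ("low", "R-NAS"): (5.6, 5.5),
}


def d_score(pre: float, post: float) -> float:
    """Difference score: postexperimental minus preexperimental.

    A negative value indicates a decrease in reported anxiety.
    """
    return post - pre


def mean_d_by_reinforcement(means: dict) -> dict:
    """Unweighted mean D score for each reinforcement condition.

    ASSUMPTION: equal cell sizes, so a simple average of the high and
    low test anxious cells is appropriate.
    """
    grouped: dict = {}
    for (_anxiety, condition), (pre, post) in means.items():
        grouped.setdefault(condition, []).append(d_score(pre, post))
    return {cond: sum(vals) / len(vals) for cond, vals in grouped.items()}


result = mean_d_by_reinforcement(combined_means)
for condition, mean_d in result.items():
    print(f"{condition}: mean D = {mean_d:+.2f}")
```

Under the equal-cell assumption, the two condition means agree with the values of −.25 and −2.85 given in the text, and the high test anxious R-NAS cell alone shows a drop of 5.6 points.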


Discussion 


Relationships in accordance with the predic- 
tions were obtained neither between verbal be- 
havior and reinforcement, nor between reinforce- 
ment and anxiety score changes. The single 
predicted effect resulting from reinforcement was 
found in the frequency of negative academic 
self-statements. Under the condition of R-NAS 
a larger number of NAS was elicited than under 
the condition of R-PAS. It is, however, con- 
sistent with other studies (Rogers, 1960; Sarason 
& Ganzer, 1962) that verbal reinforcement for 
positively toned self-statements was relatively 
less effective than reinforcement for negative
statements.

To the extent that the reinforcement treatments
carried over, or generalized, to anxiety
self-report, it is reasoned that the present
experimental design provided a greater correspondence
between reinforcement contingencies
and personality measures than other studies
have. In contrast to the present findings, Rogers
(1960) was unable to demonstrate generalization
from the reinforcement conditions to a
series of self-report measures. A study by
Sarason and Ganzer (1963) is similar to the
Rogers study in that little apparent similarity
of content existed between an interview task
and the second administration of some personality
scales. In neither study were substantial
changes in scale scores observed except those
which must be attributed to the regression
phenomenon.

In this investigation it is difficult to account
for the differences in verbal behavior and the
differences in anxiety score changes by the same
learning paradigm. With limitations the operant
paradigm can partially account for the verbal
behavior. Changes in anxiety scores, however,
did not conform to the expectations generated
by the operant model and one is forced to look
elsewhere for an explanation.

It may be fruitful to view the anxiety changes
as having resulted from the process of reciprocal
inhibition as extended by Wolpe (1958) to
account for anxiety reduction in therapy. According
to evidence marshaled by Wolpe it is
possible to inhibit anxiety associated with specific
classes of stimuli by pairing a verbal presentation
of the latter with a strong relaxation response.
His reasoning is that a relaxation response is
incompatible with an anxiety response and consequently
inhibits it. With repeated systematic
presentation of the anxiety cues in the presence
of relaxation the cues become “desensitized.” In
the present study subjects were encouraged in a
nonthreatening (relaxing?) situation to say either
positive or negative things about their academic
selves. It is suggested that those subjects who
were reinforced for saying negative things (cues
associated with anxiety) experienced a temporary
reduction of academically generated anxiety and
consequently reported themselves as less anxious
when presented with the cues in a different (no
longer idiosyncratic) form, the pencil-and-paper
anxiety scale items. Conversely, for subjects
reinforced for making positive academic self-statements,
verbal cues presumably not associated
with anxiety, desensitization through reciprocal
inhibition did not occur and self-report of
anxiety level remained substantially the same. Thus,
it appears that changes in verbalizations
followed the operant conditioning paradigm whereas
reported anxiety changes corresponded more
closely to a classical conditioning paradigm. The
uncertain relationship between the operant and
classical learning models as they concern human
behavior has special cogency for future psychotherapy
research.


RECIPROCITY AND RESPONSIBILITY REACTIONS TO PRIOR HELP ¹


RICHARD E. GORANSON AND LEONARD BERKOWITZ


University of Wisconsin 


The present study was conducted to clarify the findings of a previous experi-
ment which showed that persons who had previously received help themselves
were more willing to work for a dependent peer than were Ss who had not
received prior help. Ss were 84 college women given a dull preliminary task to
perform. A peer (E's confederate) took the initiative in helping 1/3 of Ss on this
task, while she supposedly was instructed by E to give aid in another 1/3 of the
cases, and refused to help the remaining Ss. Following this, all Ss were led to
believe that they were to be “workers” under the guidance of a “supervisor”
who was represented to 1/2 of Ss as being the same peer that they had en-
countered earlier and to the other 1/2 as a different peer. All Ss were further
led to believe that their supervisor’s chances of winning a cash prize were
highly dependent on how hard Ss worked. When the supervisor was the same
person, Ss worked harder after receiving voluntary help than did Ss who
received the compulsory help. Ss who had been refused prior help were least
willing to work for their same-person, dependent supervisor. Differences among
the 3 help conditions for those Ss working for the different supervisor were not
significant but the condition means were ordered in the same way.


Several authors have recently outlined models
of social interaction which are essentially economic
or utilitarian in nature. Thibaut and Kelley
(1959), for example, have set forth an analysis
of a wide range of social behavior in terms of
the rewards and costs incurred by the people
involved. According to this model, each individual
attempts to maximize the ratio between his own
rewards and costs. The basic assumption involved
here is that a person behaves or interacts in a
given way because he believes that it is to his
advantage to do so. The authors apply this analysis
to simple social behavior and also to such
complex social phenomena as role behavior, conformity
to norms, and group leadership.


1 This research was conducted under Grant GS-21
from the National Science Foundation to Leonard
Berkowitz.


In a recent review, Gouldner (1960) has ad- 
dressed himself to the area of reciprocal exchange 
in social relationships. His approach differs some- 
what from that of the utilitarian analysts in that 
it lays emphasis on the normative aspects of 
reciprocity. Gouldner goes so far as to say that
there is a universal, moral norm of reciprocity 
which makes the minimal demands that 
(1) people should help those who have helped them,
and (2) people should not injure those who have 
helped them [p. 171]. 


This normative reciprocity that Gouldner pro- 
poses may violate certain economic principles. 
The author points out that the reciprocity norm 


engenders motives for returning benefits even when 
power difference might invite exploitation. The norm 
thus safeguards powerful people against the tempta- 
tion of their own status; it motivates and regulates 
reciprocity as an exchange pattern, serving to inhibit 
the emergence of exploitative relations . . . [p. 174]. 


It may also be seen that in a short-term en- 
counter it is not always economically reasonable 
for one person to return the help of another per- 
son. Gouldner argues that if it were not for a 
norm of reciprocity it would be unlikely that the 
first person would be willing to extend his help. 


When internalized in both parties [however], the 
norm obliges the one who has first received a benefit 
to repay it at some time; it thus provides some real- 
istic grounds for confidence, in the one who first 
parts with his valuables, that he will be repaid [p. 
177]. 


Some writers have suggested that there may be 
a different kind of noneconomic motive operating 
in some social situations. Berkowitz and Daniels 
(1963) have proposed that persons in our soci- 
ety may be motivated to help others simply 
because those others are dependent on them. 
They further suggest that people in our society
generally learn a standard of conduct prescribing
that they behave in a “socially responsible”
fashion. That is, among other things, they should 
help those who are dependent on them. Thus, 
people at times act on behalf of others, not for 
material gain or social approval, but for their 
own self-approval, for the self-administered re- 
wards arising from doing what is "right." 

In order to test this notion experimentally in a 
laboratory setting, these authors created a situa- 
tion in which subjects were led to believe that 
they were taking part in a test of supervisory 
ability. The subjects were told that they were to 
be "workers" performing in accordance with in- 
structions from a peer (the “supervisor”). Some 
subjects were told that if they performed well, 
their supervisor could win a prize, while the other 
subjects were told that their supervisor's chances 
did not depend on how hard they worked. The 
results showed that subjects worked hardest 
when their supervisors’ ratings (and chances for 
a prize) were highly dependent on their performances.
Subjects worked hard for their partners
even when there was apparently “nothing in
it” for themselves.

In a more recent study dealing with socially 
responsible behavior, Berkowitz and Daniels 
(1964) introduced the variable of prior help. All 
subjects in this experiment were required to 
work on a tedious preliminary task, but half of 
them were given help “voluntarily” from a peer 
(the experimenter's confederate posing as a fel- 
low subject), while the other half was given no 
help. Upon completion of the preliminary task, 
the subjects were put through the "supervisory 
ability" phase of the experiment described above, 
with each subject believing that she was the 
worker and her partner the supervisor. 


The results of this experiment indicated that
the subjects worked hardest for the dependent
supervisor when they had previously received 
help from the confederate. Berkowitz and Dan- 
iels (1964) suggest two possible explanations for 
this result. One explanation 


is based on the reciprocity principle. The girls helped 
by the experimenter’s confederate conceivably felt 
some obligation to pay her back. Such feelings of 
obligation could have generalized to the dependent 
supervisor so that, in essence, by working hard 
for this latter person they were reciprocating for the 
assistance they had received. If so, the present find- 
ings can be understood as a special case of the 
reciprocity norm. People supposedly live up to their 
social obligations in order to pay back for the good 
turns they had received in the past and those they 
expect to receive in the future (cf. Gouldner, 1960)
[p. 281]. 


The other explanation advanced involves the 
hypothesized social responsibility norm. The 


prior help may have heightened awareness of the 
[social] responsibility norm in many of the subjects 
working for a dependent peer. The relatively high 
level of productivity in this condition, then, presumably
was the result of the increased salience of
the socially prescribed obligation to aid others need- 
ing help [pp. 280-281]. 


A further complexity in these results is pointed
out by the authors: the difference in performance
in the groups may have been due, not so much to
heightened motivation in their prior-help groups,
but rather to a decreased motivation in their
no-prior-help groups. The fact that these subjects
were not helped may have produced some resentment
that was reflected in the lower performance.
If, indeed, this was the case, this effect might
also have come about in either of the two ways
outlined above. The partner’s failure to offer
help may have prompted the subjects to “reciprocate”
by not helping the other person, or it
may have acted to “dampen” potential feelings
of responsibility.

The present research is aimed at an elabora- 
tion and clarification of the findings of the Berk- 
owitz-Daniels study. Specifically, it seeks to com- 
pare the effects of prior help as mediated by a 
direct reciprocity principle with the effects as 
mediated by an arousal or dampening of the 
awareness of the responsibility norm. In addition, 
attention will be given to the question of whether 
the effect of the prior-help variable is due to a 
heightened motivation in the case where the prior 
help is given, or to a decreased motivation in the
case where prior help is withheld.


METHOD 
Subjects 


The subjects were 84 undergraduate women volun- 
teers from introductory psychology, philosophy, and 
political science courses at the University of Wiscon- 
sin. The subjects from psychology classes were par- 
ticipating for experimental points to be added to 
their course grades. The nonpsychology subjects were 
recruited by the experimenter at the beginning of one 
of their regular class periods. These subjects were 
offered no inducement to sign up other than a chance 
to see “what psychology research is like.” Subjects
were each contacted by telephone to make arrange- 
ments and to insure that pairs of subjects were not 
previously acquainted. The subjects were assigned 
to the six experimental conditions as they arrived 
in an ABCDEFFEDCBA order.
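The ABCDEFFEDCBA assignment order can be illustrated with a short sketch. The letters stand for the six experimental conditions, but which letter corresponds to which condition is not stated in the text, so the labels are kept abstract here; the function name is hypothetical.

```python
from collections import Counter
from itertools import cycle, islice

# One full pass through the counterbalanced sequence covers 12 arrivals
# and uses each of the six conditions exactly twice.
ORDER = "ABCDEFFEDCBA"


def assign_conditions(n_subjects: int, order: str = ORDER) -> list:
    """Assign subjects to conditions in order of arrival, repeating the sequence."""
    return list(islice(cycle(order), n_subjects))


assignments = assign_conditions(84)
counts = Counter(assignments)
for condition in sorted(counts):
    print(condition, counts[condition])
```

Because 84 is a multiple of 12, the sequence is completed exactly seven times and every condition receives the same number of subjects. Running the sequence forward and then backward also balances the conditions with respect to order of arrival.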


Procedure 


Experimental conditions. Two experimental sub- 
jects and two paid participants posing as subjects 
were present at each session. The experimenter ex- 
plained that two separate experiments would be 
conducted. The first experiment was represented as 
some necessary groundwork for a future industrial
psychology study. Subjects were told that each per- 
son would be working on a different task, and that 
the tasks required different amounts of time to be 
completed. After these initial instructions, subjects 
were escorted to separate rooms and given three 
sheets, each containing 27 lines of haphazardly or- 
dered letters. The subject’s task was to circle all of 
the w's and to write down the time taken on each 
page. 

Nature of prior help: voluntary, compulsory, or 
refused. After 5 minutes, the experimenter entered the
subject’s room and introduced the first set of experi- 
mental manipulations. All subjects were told that 
the "girl next door" had finished her task. One-third 
of the subjects were told that the girl had volun- 
tarily offered to help the subject with her task 
(voluntary-help treatment). In this treatment, after 
the experimenter left the room the confederate entered
and appropriately volunteered to work on one
sheet. Another third of the subjects were told that 
the experimenter would instruct the girl next door 
to come in and work on one of the subject’s sheets 
(compulsory-help treatment). For these subjects the 
confederate came in and took one sheet. For the 
final third of the subjects, the experimenter said that 
the girl might be willing to help the subject. How- 
ever, in this condition the confederate came in and 
pointedly refused to help the subject (help-refused 
treatment). 

The voluntary-help condition and the help-refused 
condition were designed to establish the variable of 
prior help, analogous to that employed in the most 
recent Berkowitz-Daniels study. The compulsory- 
help condition was included as an approximation of 
a control group where help was in fact given, but 
not volunteered. 

When all subjects had completed their tasks, the 
group was reassembled, and the second “separate ex- 
periment” was explained. As in the Berkowitz-Dan- 
iels studies, this phase was represented as a project 
involving the construction of a test of “supervisory 
ability.” There were to be two “randomly selected” 
pairs, with a worker and a supervisor in each pair, 
both working on different problems. The supervisor 
was to write instructions as to how to construct a 
small paper box. The worker’s job was to make boxes 
following the supervisor's instructions. Another
problem (the construction of a paper cup) was also 
mentioned, but not actually used, in order to avoid 
possible feelings of competition between the two 
workers. Subjects were told to return to the rooms 
where they had been before, while the experimenter 
“randomly selected" the pairs. When the subjects 
were settled in their respective rooms, the experi- 
menter informed both real subjects that they would 
be workers. 

Working for the same or a different partner. At
this point the experimenter introduced the manipu- 
lation for the second variable. One half of the sub- 
jects were told that their supervisor was to be the 
same girl who had previously helped (or not helped) 
them. The rest of the subjects were told that their
supervisor would not be the same girl but a different 
girl. 

The rest of the procedure was the same for all 
subjects. After 6 minutes the experimenter delivered 
the box-making instructions to the subject. These in- 
structions were, in fact, prepared beforehand and 
were identical for all subjects. After receiving the 
instructions, the subject was given an 8-minute 
“practice period” in order to get a measure of the 
subject’s natural working pace with which the work 
period performance could be compared. 

Dependency relation. After the 8 minutes of practice,
the subject was informed that her supervisor’s
rating would depend on the subject’s output during 
the “work period.” The subject was also informed 
that the supervisor was eligible for a $5 prize if she 
got the highest supervisory rating in the box-making 
experiment. In other words, it was clearly implied 
that the supervisor's chance of winning the prize was
greatly dependent on the subject's performance. 

The subject was then left alone for 20 minutes to 
construct paper boxes. At the end of this period the 
experimenter returned and gave the subject a ques- 
tionnaire to fill in. Following the questionnaire, the 
real nature of the experiment was explained and 
subjects were sworn to secrecy. 


RESULTS 
Effectiveness of Experimental Manipulations 


The postexperimental questionnaire included 
several questions designed to check the effectiveness
of the experimental manipulations. Analysis
of responses to these questions indicated that 
both the prior-help and the identity-of-supervisor 
variables were successfully established. 

Prior help. Responses to the question, “Were
you helped at all during the first ‘experiment’
...?” showed that all the subjects in the refused-help
groups said that they had not received
help and that all but two subjects in the volun- 
tary-help and compulsory-help groups said that 
they had received help. 

The responses to the question, “Was her help 
offered voluntarily?" showed that subjects in the 
voluntary-help conditions saw the help that they 
received as being offered more voluntarily than 
did the subjects in the compulsory-help condi- 
tions. The high correlation between being in the 
voluntary-help condition and reporting that the 
help was given voluntarily (phi = .86) clearly
indicates the effectiveness of the help manipulation.
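For readers unfamiliar with the phi coefficient, it is the correlation between two dichotomous variables and can be computed directly from a 2 × 2 table of counts. The sketch below shows the standard formula; the cell counts are hypothetical and are not the data behind the reported value, which the article does not give.

```python
import math


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Phi coefficient for the 2 x 2 table [[a, b], [c, d]].

    Rows: condition (voluntary help, compulsory help).
    Columns: response (help seen as voluntary, help not seen as voluntary).
    """
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator


# HYPOTHETICAL counts, for illustration only.
voluntary_yes, voluntary_no = 26, 2
compulsory_yes, compulsory_no = 3, 24

phi = phi_coefficient(voluntary_yes, voluntary_no, compulsory_yes, compulsory_no)
print(f"phi = {phi:.2f}")
```

Phi equals 1 when every subject's report matches her condition and 0 when condition and report are unrelated, so a value in the .80s indicates that few subjects misperceived the manipulation.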

Identity of supervisor. All subjects included in 
the analysis were able to identify their supervisor 
correctly as being either "the same girl that came 
into your room while you were working on the 
first experiment” or “the other girl.” ²

In summary, subjects receiving the voluntary- 
help treatment reported that the assistance that 
they received was given voluntarily, subjects re- 
ceiving the compulsory-help treatment saw their 
help as much less voluntary, and those receiving 
the refused-help treatment reported that they 
were not helped at all. Subjects in the same 
supervisor conditions identified their supervisor 
as the same girl that they had encountered dur- 
ing the prior-help manipulation, and subjects in 
the different supervisor conditions identified their 
supervisor as someone other than the girl they 
had encountered in the prior-help manipulation. 


2 One subject in the compulsory-same condition
was unable to identify her supervisor correctly and
for this reason none of her scores were included in
the analysis.


Performance on the Experimental Task


The main dependent variable employed in the
analysis was the increase in the rate of produc- 
tion of boxes from the practice period to the 
work period. Since the work period was 2.5 times 
as long as the practice period, the total practice- 
period production was multiplied by 2.5, yielding 
a production rate score comparable for both 
periods. A correlation of .69 was obtained be- 
tween these measures. The rate-of-production- 
increase score was then obtained by subtracting 
the practice-period rate from the work-period 
rate for each subject. The use of this rate-of- 
production-increase score allowed for a measure 
of control for the variability due to persistent
individual differences in ability and motivation. 
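The rate-of-production-increase score described above amounts to a simple rescaling followed by a subtraction. A minimal sketch follows, using the 8-minute practice period and 20-minute work period from the Procedure; the function name and the box counts in the example are hypothetical.

```python
PRACTICE_MINUTES = 8
WORK_MINUTES = 20


def rate_increase(practice_boxes: float, work_boxes: float) -> float:
    """Work-period production minus practice production scaled to 20 minutes.

    The scale factor is WORK_MINUTES / PRACTICE_MINUTES = 2.5, so both
    terms are expressed as boxes per 20 minutes.
    """
    scale = WORK_MINUTES / PRACTICE_MINUTES
    return work_boxes - scale * practice_boxes


# HYPOTHETICAL subject: 4 boxes in practice, 20 boxes in the work period.
example = rate_increase(practice_boxes=4, work_boxes=20)
print(f"rate-of-production-increase score: {example}")
```

A score of zero means the subject simply maintained her practice pace, so positive scores reflect added effort once the supervisor's dependency had been introduced.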

In addition to the prior-help and identity-of- 
supervisor variables, one additional variable was 
included in the analysis of the performance 
scores. This was the identity of the confederate 
who was acting as the subject's "supervisor." 
Since only two confederates were employed 
throughout the experiment, this factor was varied 
so that half of the subjects worked for one girl, 
half for the other. The summary table for the 
analysis of variance of the increase-in-production- 
rate scores is given in Table 1. 

A Duncan multiple range test of the three 
prior-help condition means showed that the ex- 
pected relation was obtained; the mean of the 
voluntary conditions was greater than the mean 
of the compulsory conditions (p < .05), and the
mean of the compulsory conditions was in turn 
larger than the mean of the refused conditions 
(p < .05). 

The significant Help × Supervisor interaction,
however, indicates that the effects of the prior
help did not operate independently of the identity
of the supervisor. In order to investigate
this interaction further, a Duncan multiple range
test was performed on the means of the six main
experimental groups. The condition means and
test results are presented in Table 2. In the
three same conditions the ordering of the means
from highest to lowest is voluntary-compulsory-refused,
and the differences between condition
means are all significant at the .01 level. In other
words, the voluntary-help treatment increased the
subjects’ effort in working for the person who
had helped them earlier, and the refused-help
treatment reduced the effort put out on behalf of
the person who had previously refused them help.
That the prior-help manipulation had any influence
at all in the three different supervisor conditions
is suggested only by the fact that the
means for these conditions are similarly ordered
although they do not differ significantly from
one another.


TABLE 1


SUMMARY TABLE FOR THE ANALYSIS OF THE
RATE-OF-PRODUCTION-INCREASE SCORES


Source                                  df       MS        F
Supervisor (same or different) (A)       1     52.65     4.59*
Confederate (B)                          1       .24
Help (voluntary, compulsory,
  refused) (C)                           2    132.94    11.60**
A × B                                    1     23.57     2.06
A × C                                    2     75.58     6.62**
B × C                                    2      1.39
A × B × C                                2      5.51
Error                                   72     11.46

* p = .05.
** p = .01.


TABLE 2


MEANS OF MAIN EXPERIMENTAL CONDITIONS FOR
THE RATE-OF-PRODUCTION-INCREASE SCORES


                             Help
Supervisor     Voluntary   Compulsory   Refused
Same             16.57       12.78        8.93
Different        11.70       11.25       10.61


Note. Means with common subscript are not significantly
different (at .05 level) by Duncan multiple range test.


Questionnaire Data 


A 2 × 2 × 3 (Supervisor × Confederate ×
Help) analysis of variance was performed on the 
postexperimental questionnaire items, and the sig- 
nificance of differences among the means of the 
six main experimental conditions (Supervisor X 
Help) was evaluated by the Duncan range test. 

Social norms are most often defined in terms
of the perceived expectations of others. A “social 
responsibility norm" would involve an individu- 
al’s belief that others expect him to help those 
who are highly dependent on him, even without 
the promise of any material reward. Following 
this reasoning, the question, “To what extent 
would most people have expected you to work 
hard to help your supervisor win the prize?” was 
included in the questionnaire as an index of the 
subjects’ awareness of this kind of expectation— 
that is to say, as a measure of the salience of a 
“social responsibility norm.” 

In the same-supervisor condition a high score
on this item probably indicates awareness of a
reciprocity norm; the item here reflects the degree
to which the subject believes most people
would expect her to repay the individual she
had encountered earlier. As Table 3 shows, the 
perceived expectation to work hard in the same 
condition is reliably stronger when the subject 
had received voluntary help than when the help 
given the subject had been required. The women 
felt they were particularly expected to recipro- 
cate for favors done voluntarily for them. Inter- 
estingly, those girls who had been refused help 
did not seem to believe that other people would 
expect them to retaliate in kind. Like the sub- 
jects in the compulsory-help group, they thought 
there was a moderate expectation that they 
should aid the person who was dependent on 
them, even though she had refused to help them 
earlier. 

The scores in the different supervisor condi- 
tions suggest that the responsibility norm is 
somewhat weaker than the norm prescribing reciprocity.
More important, the subjects who had
received the voluntary help were not reliably
more aware of an expectation to help this dif- 
ferent person than were the subjects in the two 
other different-supervisor groups. It may be that 
the subjects in all of the groups were at least 
moderately aware of the responsibility norm be- 
cause the supervisor’s dependency on them was 
very clear and easily grasped. The differences 
shown in the different-supervisor condition might 
have been significant if, for one reason or an- 
other, the situation were somewhat more ambigu- 
ous and/or the dependency relationship weaker. 

That subjects in the refused conditions could 
have become resentful because they felt they 
had been treated badly in the first part of the 
experiment is indicated by responses to the item 
designed to measure the subjects’ liking for their 
supervisors: “Would you want your supervisor as 
a roommate... ?” A comparison of the six 
condition means shown in Table 4 indicates that 
it is the refused-same condition which differs 


TABLE 3

MEANS OF MAIN EXPERIMENTAL CONDITIONS FOR THE
QUESTION, “TO WHAT EXTENT WOULD MOST
PEOPLE HAVE EXPECTED YOU TO WORK
HARD TO HELP YOUR SUPERVISOR
WIN THE PRIZE?”

                         Help
Supervisor    Voluntary   Compulsory   Refused
Same            10.21        8.14        8.57
Different        9.14        8.86        7.57

Note. Means with common subscript are not significantly
different (at .05 level) by Duncan multiple range test. A high
score indicates that subjects felt very much that most people
would have expected them to work hard.
(Subscript letters are illegible in the source.)

TABLE 4

MEANS OF MAIN EXPERIMENTAL CONDITIONS FOR THE
QUESTION, “ON THE BASIS OF WHATEVER IMPRESSIONS
YOU MIGHT HAVE, WOULD YOU
WANT YOUR SUPERVISOR AS A ROOMMATE
(ASSUMING YOU WERE LOOKING
FOR A ROOMMATE)?”

                         Help
Supervisor    Voluntary   Compulsory   Refused
Same             4.71        5.07        1.19
Different        5.86        4.93        5.36

Note. Means with common subscript are not significantly
different (at .05 level) by Duncan multiple range test. A high
score indicates a low liking.
(Subscript letters are illegible in the source.)


from all the others at the .05 level of signifi- 
cance. The refused-help treatment probably con- 
stituted a frustrating experience, producing in 
turn a hostility toward the supervisor when it 
was this person who was the source of the frus- 
tration. 

Discussion 


Two different, although perhaps related, norma- 
tive expectations may have to be advanced to 
explain the present results: a norm prescribing 
that a dependent individual should be helped 
(the responsibility norm), and a social standard 
calling for the repayment of benefits received 
from others (the reciprocity norm). 

As discussed earlier, the reciprocity norm has 
received considerable attention recently. The 
findings for the same supervisor groups represent 
a direct contribution to these formulations in 
that they point to one of the conditions on which 
an individual’s feelings of obligation to make 
repayments may be contingent. The college 
women in this sample evidently regarded them- 
selves as obligated to reciprocate for the help 
they had received primarily when this help had 
been given voluntarily. Thus, when they were 
working for the person who had helped them 
earlier, they worked harder and were more defi- 
nite in stating that other people would expect 
them to work hard, when the earlier assistance 
had been given voluntarily rather than when it 
had been either required or had been refused. 

The conjectured responsibility norm could well 
have been operating, to some extent at least, in 
the same-supervisor condition as well as when 
subjects were working for a different supervisor. 
This is indicated in a number of ways. For one 
thing, the subjects in the same-supervisor treatment
who had been refused help earlier exhibited
lower motivation to assist their supervisor and
expressed less liking for her than did the girls in
either of the other two same-supervisor groups. 
They were evidently annoyed at not being helped 
earlier. It may be that they believed the other 
person should have aided them; the societal 
standard had called for giving help and this norm 
had been violated. It is, of course, also quite 
possible that subjects interpreted the refusal to 
aid them as a personal rebuff or insult. 

In the different-supervisor condition the relatively
high rate of productivity and some of the
questionnaire responses point to the existence of 
the responsibility norm. All three groups in this 
treatment showed a considerable gain in produc- 
tivity, perhaps largely because they believed 
this different person was dependent upon them. 
Conformity to this presumed norm could then 
have minimized the group differences in this 
condition. All three groups also reported a mod- 
erately strong expectation prescribing that they 
should aid their supervisor, again suggesting that
most of the subjects were at least somewhat 
aware that they should behave in a responsible 
manner. 

The reciprocity behavior seen in the present 
study probably should not be considered simply 
as an instance of “rational” or economic interchange.
First, the work was arranged so as to
minimize, for all subjects, the anticipation of any
material return. Second, it was emphasized to the
subjects that their “partnership” with their
supervisor was to terminate with the end of the
work period; in fact, subjects were led to believe
they would not ever see their supervisors
again. Since the relationship was not to continue,
subjects had no rational motivation to repay their
partners in order to “keep their credit good.”
Finally, as was pointed out earlier, the special
influence of the perceived “voluntariness” of the
prior help does not fit in well with the economic
model.
Given a fundamentally incongruous situation in which a favorable source
makes a strong negative assertion about a valued concept, 4 different strategies
are suggested by the principle of congruity to reduce the potential negative
attitude change toward the concept. Their effectiveness was experimentally
investigated under both preattack (immunization) and postattack (restoration)
conditions. Refuting the attack, derogating the source, and prior strengthening
of the concept reduced persuasion significantly, but the source's denying making
the assertion did not lead to significantly less change than an attack-only
condition. There was a general superiority of the immunization over the
restoration sequence.


Recent developments in attitude theory have 
some grave implications for the vulnerability of 
individuals to persuasive manipulations. Cohen 
(1960) found such Orwellian implications in his 
recent review of attitudinal consequences of 
cognitive-behavior inconsistencies. But he also 
saw a possible silver lining: “On the positive
side, let us hope that any principles derived . . .
are equally applicable to developing resistance to
persuasive inducements [p. 318].” Apart from
the value judgments involved, his point is well 
taken—a theory of attitude change should also 
be able to prescribe conditions under which 
attitudes will not change. 

Such a consideration has prompted several at- 
tempts to investigate the effects of various pre- 
treatments in diminishing the impact of a subse- 
quent persuasive event. Janis, Lumsdaine, and 
Gladstone (1951) studied a contradictory pre- 
paratory communication, and attributed its re- 
sistance effects to the proactive inhibition it pre- 
sumably elicited. Dissonance theory (Festinger, 
1957) has been used to explain the apparent 
resistance conferred by a forewarning of a belief 
attack (cf. Allyn & Festinger, 1961; Kiesler & 
Kiesler, 1964). McGuire (1964) has developed a 
comprehensive theory of “inoculation against
persuasion” (somewhat analogous to biological
immunization) to account for the defense of
“noncontroversial cultural truism” beliefs. He has
marshaled some impressive evidence to support 
his contention that prior exposure to weakened 
forms of the attack may serve to stimulate, but
not overwhelm, defense of the initial belief (cf.
McGuire, 1961; McGuire & Papageorgis, 1961;
Papageorgis & McGuire, 1961).


1 This research was supported in part under Grant
G-23963 from the National Science Foundation, and
in part by a grant from the Graduate Research
Committee at Wisconsin from funds supplied by the
Wisconsin Alumni Research Foundation, both to the
senior author.

The principle of congruity (Osgood, Suci, & 
Tannenbaum, 1957; Osgood & Tannenbaum,
1955) represents a somewhat different approach
to the study of attitude formation and change,
and thus provides a different theoretical rationale
for investigating resistance to persuasion. The
present paper is concerned with assessing several 
persuasion-reducing strategies suggested by the 
theory. 


Congruity Principle Strategies


Let us assume a simple communication situa- 
tion in which a favorable source (S+) makes a 
strong negative assertion (A−) against a generally
favorable concept (C+). Under such conditions,
the cognitive elements involved are in an
incongruous relationship, and the theory predicts, 
among other things, a negative shift in attitude 
toward C. The problem posed is to eliminate 
or reduce this negative change in the C attitude. 

Application of the principle of congruity to 
this problem stems from its basic postulate that 
the existence of incongruity produces a pressure 
toward change. Hence, reducing the degree of in- 
congruity should serve to reduce the degree of 
attitude change. A number of possible procedures 
or strategies for rendering the situation more 
congruous are indicated: 


1. Severing the cognitive link. Basic to the
congruity principle is the notion that the issue of 
incongruity arises only when the cognitive ob- 
jects involved are brought into association with 
one another (Osgood & Tannenbaum, 1955, p. 
43). Thus, in the present situation, if the particular
S and C can somehow be dissociated,
the degree of incongruity should be reduced
accordingly.

2. Changing attitude toward source. Given a 
negative assertion toward a favorable concept, a 
more congruous situation would obtain if the 
source were negatively evaluated. Accordingly, 
the indicated strategy is the attack and deroga- 
tion of the identified source. 

3. Invalidating the assertion. The weaker the 
assertion attacking the concept, the less effec- 
tive it should be in producing a negative attitude 
change. Thus, if the subject is manipulated so as 
to question the validity of the attack arguments, 
the desired result would obtain. Furthermore, if 
such a manipulation were to totally invalidate 
and even reverse the attack—for example, 
through a specific, point-by-point refutation— 
the situation would be rendered still less
incongruous.

4. Strengthening attitude toward the concept. 
Another attitude-change principle incorporated 
within congruity theory is that more intense 
attitudes are more resistant to change (cf. 
Tannenbaum, 1956). Thus, if the initial C atti- 
tude can be boosted and made even more favor- 
able, it should be less susceptible to the main 
attack. 
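
The logic of the source-derogation and concept-boost strategies can be illustrated numerically. What follows is only a minimal sketch, assuming the basic congruity formula of Osgood and Tannenbaum (1955) in a simplified form that omits their correction for incredulity and their assertion constant. The function name `congruity_shift` and the example scale values are hypothetical and are not taken from the present study.

```python
def congruity_shift(source: float, concept: float, associative: bool = False) -> float:
    """Predicted change in the concept attitude after one assertion.

    Attitudes are assumed to lie on a bipolar scale such as -3..+3.
    Simplified reading of the congruity formula (assumption, not the
    authors' procedure): the concept moves toward its congruent position
    in proportion to the relative polarization of the source.
    """
    total_polarization = abs(source) + abs(concept)
    if total_polarization == 0:
        return 0.0  # both objects neutral: no pressure toward congruity
    # For a negative (dissociative) assertion the congruent position of the
    # concept is the mirror image of the source attitude.
    congruent_position = source if associative else -source
    pressure = congruent_position - concept
    return abs(source) / total_polarization * pressure


if __name__ == "__main__":
    # Hypothetical values: favorable source (+2) attacks favorable concept (+2).
    print("attack only      :", congruity_shift(2.0, 2.0))
    # Strategy 2: the source has been derogated to -2 before the attack.
    print("derogated source :", congruity_shift(-2.0, 2.0))
    # Strategy 4: the concept has been boosted to +3 before the attack.
    print("boosted concept  :", 3.0 + congruity_shift(2.0, 3.0), "(final attitude)")
```

Under these assumed values the derogated source exerts no pressure on the concept, and the boosted concept ends at a more favorable final position than the unboosted one, which is the qualitative pattern the strategies are meant to produce.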


The purpose of the present research was to 
make manifest each of these four congruity 
principle strategies,2 and investigate their respective
persuasion-reduction capacities both as preattack
(immunization) and postattack (restoration)
treatments. In addition, one or another of
these strategies was replicated in subsequent
studies, and such supplemental findings are
included.


METHOD 
Belief Issues 


For reasons of convenience and to allow for pos- 
sible comparison, the issues were the same health 
practices used by McGuire. Three such topics were 
included: frequent toothbrushing as a decay-preven- 
tive practice, regular medical checkups even in the 
absence of specific illness, and the use of X rays for 
the detection of tuberculosis. 

An attack on each of these practices was prepared
in the form of a copy of an “official statement”
from the United States Public Health Service
(USPHS). The message clearly identified USPHS as
the source of the statement and proceeded to state
generally and then argue specifically four separate
points against the particular health practice. The
actual messages were adapted from McGuire’s materials
(cf. McGuire & Papageorgis, 1961), the language
and style being changed somewhat to better
conform to the apparent message format.3


2 While the four strategies are here derived directly
from the principle of congruity, they are not unique
to it. The basic rationale derives as well from such
other consistency models of attitude change as dissonance
(Festinger, 1957) or balance (Rosenberg &
Abelson, 1960) theory, and some of the indicated
procedures are also suggested by other individual
work, for example, the source attack strategy
by Hovland and Weiss’ (1951) study of source
credibility.

Experimental Treatments 


Special messages were prepared to reflect the four 
different strategies suggested by the congruity prin- 
ciple. These took somewhat different forms for the 
different treatments. 

Denial. The strategy of severing the cognitive link
between the source and concept took the form of
a USPHS press release, denying any connection with
“recent statements which have been erroneously
attributed” to USPHS. The message stated that
USPHS “neither agrees nor disagrees” with the
alleged statements, but that it wished to “correct the
mistaken impression that the recommendations
regarding (the particular health practice) were
authorized by the Service.”

Source attack. The strategy of derogating the 
source took the form of an Associated Press news 
story of a report by “a special investigating committee
of medical researchers and practitioners”
blasting USPHS as “incompetently staffed, riddled 
with political appointees, and generally not serving 
the public interest.” The story quoted extensively 
from the ostensible committee report, which gave 
USPHS a thorough raking, including criticism of 
“totally unwarranted statements about various 
phases of public health.” 

Refutation. The strategy of invalidating the asser- 
tion was in the form of a detailed refutation of the 
attack message by a special committee of either the 
American Medical Association or the American 
Dental Association, depending on the specific issue. 
In each case, the message stated the specific counter- 
argument against the belief and offered a point-by- 
point rebuttal. It did not specifically identify the 
main attack nor mention the USPHS, but referred to 
“recent stories and statements to the public” arguing 
against the given health practice. These refutation 
messages, too, were adapted from McGuire’s test
materials.

Concept boost. The strategy of bolstering the 
concept attitude took a form similar to that of the 
refutation treatment. It was also identified as part 
of a statement from a special committee of the 
particular professional association. But it merely
offered supportive evidence for the particular health
practice, without reference to any counterarguments.
Here as well, McGuire’s “supportive” treatment
materials provided the basis for our statements.


3 Copies of the actual attack messages plus other
relevant experimental materials, including the various
experimental conditions, instructions, etc., may be
obtained by writing directly to the Mass Communications
Research Center, University of Wisconsin,
Madison 53706.


Belief Measures 


The main dependent variable of belief change was 
assessed in the manner used by McGuire (cf. Mc- 
Guire & Papageorgis, 1961). Three separate state- 
ments—two expressing a positive aspect, the third a 
negative one, of the given practice—were used for 
each health topic. Degree of agreement with each 
statement was indicated on a 15-point scale, com- 
posed of five main categories ranging from extreme 
agreement to extreme disagreement, with each such 
category further divided into three subcategories. 
The belief score was indexed by the sum of the
ratings across all three items, after adjustment for
consistency of direction.
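
The scoring rule just described can be sketched as follows. This is an illustrative reconstruction only: it assumes that 15 denotes the strongest agreement and that the negatively worded item is reflected as 16 minus the rating, neither of which is stated in the text. The names `scale_point` and `belief_score` are hypothetical.

```python
SCALE_MAX = 15  # five main categories, each divided into three subcategories


def scale_point(category: int, subcategory: int) -> int:
    """Map a main category (1-5) and subcategory (1-3) onto the 1..15 scale."""
    if not (1 <= category <= 5 and 1 <= subcategory <= 3):
        raise ValueError("category must be 1-5 and subcategory 1-3")
    return (category - 1) * 3 + subcategory


def belief_score(ratings, positively_worded):
    """Sum the item ratings after aligning their direction.

    ratings            -- one 1..15 rating per statement
    positively_worded  -- True for items favoring the practice, False otherwise
    Negatively worded items are reflected (assumed rule: 16 - rating) so that
    a higher total always means a more favorable belief.
    """
    total = 0
    for rating, positive in zip(ratings, positively_worded):
        if not 1 <= rating <= SCALE_MAX:
            raise ValueError("ratings must lie between 1 and 15")
        total += rating if positive else (SCALE_MAX + 1 - rating)
    return total


if __name__ == "__main__":
    # Two positive statements and one negative statement, as in the text.
    print(belief_score([15, 13, 2], [True, True, False]))
```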


General Procedure 


Male and female students in an undergraduate 
psychology of adjustment course at the University 
of Wisconsin served as subjects for the main experi- 
ment. Belief measures on all three topics were first 
obtained from all subjects during a regular lecture 
session (T1). Separate quiz sections of the course
then represented the different experimental condi- 
tions, each quiz section serving in only one condition 
and on only a single health topic. 

The four treatments were each presented as preattack,
or immunization, conditions. Approximately
3 weeks (spanning the spring-vacation period) after
the initial testing, subjects in these 12 groups (four
treatments on each of three topics) were exposed to
their respective immunization message, and their
belief was again measured (T2). One week later, again
during the quiz-section meeting, these subjects were
exposed to the attack messages, and belief was again
assessed (T3).

Each treatment was also used as a postattack, or
restoration, condition. However, because of a major
clerical error, the data for the concept boost groups
were not properly obtained. The various test materials
for these nine groups (three treatments on
each of three topics) were the same as for the immunization
groups, but the testing procedure was
reversed. That is, at the T2 session, they were
exposed to the attack message and received their
respective experimental treatments at T3.

Separate groups who received only the attack 
between two successive testings, with no other treat- 
ments, were used as the attack-only condition. 
Similarly, subjects who were merely tested on separate
occasions without any intervening exposures at
all served to represent a control condition.


RESULTS AND DISCUSSION 


Examination of the T1 data revealed substantial
variation among the different experimental
groups, possibly because, rather than randomly
assigned subjects, existing intact quiz sections
were used. Since subsequent change may in
TABLE 1

ADJUSTED BELIEF MEANS FOR DIFFERENT EXPERIMENTAL
AND CONTROL CONDITIONS

                                  Treatment
Condition        Denial     Source attack   Refutation    Boost
                (N = 16)      (N = 16)       (N = 25)    (N = 11)
Control a         11.77         11.86          11.70       11.22
Immunization       9.06         10.62          12.46       10.85
Restoration        9.96          8.54          10.45         --
Attack only a      8.56          8.30           8.82        8.39

Note. Means showing the same subscript are not significantly
different at the .05 level. This applies only to comparisons
within a given column, and not across conditions.
Scores are on a 15-point scale; higher scores indicate more favorable
beliefs. (Subscript letters are illegible in the source.)

a The different control and attack-only means in the different
columns result from the necessity for equal-sized groups within
a given covariance analysis. The respective means represent
different random selections of subgroups.



part be a function of relative initial position
(Hovland, Harvey, & Sherif, 1957; Tannenbaum,
1956), analysis of covariance of the T3 data,
covarying for T1 scores, was employed. Considerable
attrition of subjects resulted from such
a procedure: only those experimental subjects
who were present at all test sessions could be
used, and further reduction was necessitated by
the equal N requirement of the covariance
computer program.*
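
The idea of adjusting final scores for initial position can be sketched in a few lines. The example below is an illustration of the standard covariance adjustment (pooled within-group regression of the final score on the initial score), not a reproduction of the program actually used. The function name `adjusted_means` and the data are hypothetical.

```python
from statistics import mean


def adjusted_means(groups):
    """Return covariance-adjusted final-score means for each group.

    groups -- dict mapping a condition name to a list of (initial, final)
              score pairs, e.g. (T1, T3) for each subject.
    Each group's final mean is corrected by the pooled within-group slope
    times the distance of that group's initial mean from the grand mean.
    """
    grand_initial = mean(x for pairs in groups.values() for x, _ in pairs)

    sum_xy = 0.0  # pooled within-group cross-products
    sum_xx = 0.0  # pooled within-group sum of squares of the covariate
    group_means = {}
    for name, pairs in groups.items():
        mean_x = mean(x for x, _ in pairs)
        mean_y = mean(y for _, y in pairs)
        group_means[name] = (mean_x, mean_y)
        for x, y in pairs:
            sum_xy += (x - mean_x) * (y - mean_y)
            sum_xx += (x - mean_x) ** 2

    if sum_xx == 0:
        raise ValueError("covariate has no within-group variation")
    slope = sum_xy / sum_xx

    return {
        name: mean_y - slope * (mean_x - grand_initial)
        for name, (mean_x, mean_y) in group_means.items()
    }


if __name__ == "__main__":
    # Hypothetical scores: the control group starts higher than the attack group.
    data = {
        "control": [(10, 11), (12, 13), (14, 15)],
        "attack": [(8, 6), (10, 8), (12, 10)],
    }
    for condition, value in adjusted_means(data).items():
        print(condition, round(value, 2))
```

With intact groups that differ at the outset, as here, the adjustment pulls each group's final mean toward what would be expected had all groups started from the same initial position.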

The basic data are presented in Table 1, which 
includes the adjusted means for each of the 
experimental conditions, along with the means 
for the corresponding attack-only and control 
groups. In all cases, these are computed across 
the three topics, previous analysis having revealed
no between-topic differences. Table 1 also
presents the results of the comparison between 
means within a given strategy by a Duncan 
(1955) range test, as modified by Kramer (1957) 
for covariance analysis. This was our main focus 
of interest—to compare each experimental treat- 
ment relative to the attack-only and control con- 
ditions. Comparison between different strategies 
is not particularly relevant here.


Denial Treatment 


No support was found for the effectiveness of 
this strategy, either in the immunization (DA) 
or restoration (AD) conditions. The respective 
means are slightly but not significantly higher 
than in the attack-only (A) condition, and all 
three are significantly lower than the control (O) 
group. 

The same four conditions were included in a
quite separate study which had a somewhat
different methodology: Only the X-ray topic was
used; the source was identified as a professor of
medicine; and, most important, the source not
only denied making the assertion but also stated
his strong support of the health practice. The
same finding obtained, however: all three experimental
treatments (A = 7.21, DA = 8.77, AD =
8.60) changed significantly more than the O
group (12.52), with no differences among the
three.


* Use of the CDC 1604 computer at the Wisconsin
Computing Center facilitated the various analyses.

These results do not necessarily mean that the 
fundamental strategy is invalid for the reduction 
of persuasion. It may well be that the particular
denial messages employed did not sufficiently 
manifest the indicated strategy. However, it can- 
not be ascertained that this was the case here, 
since there was no available means of assessing 
whether the intended dissociation between source 
and concept really took place. 


Source-Attack Treatment 


This treatment conferred a significant degree 
of resistance in the immunization (SA) but not 
in the restoration (AS) condition. The SA group 
changed significantly less than the A group and, 
moreover, did not differ significantly from the O 
group. The AS group, on the other hand, showed 
virtually no resistance.

Since an independent check (on a separate 
rating of USPHS on a 15-point scale) did not 
reveal a major downward shift in source rating 
in the SA and AS groups, replications of this 
strategy in other studies used a different source 
(generally an identified professor of medicine) 
and somewhat stronger source attacks. In four 
such replications of the SA condition, its immunizing
efficacy was confirmed. In particular,
one study which used teen-aged summer-camp
students as subjects, with all messages delivered
orally, found the source attack to be highly
successful in downgrading the source, and even
more clear-cut differences between the O (13.26),
SA (10.43), and A (5.19) conditions. In the one
additional study which included the AS condi- 
tion, some improvement in its relative efficacy is 
apparent—AS (6.60) was still not quite signifi- 
cantly higher than A (5.96), but also was not 
significantly lower than SA (7.88). 

These results, then, support the indicated 
strategy for reduction of persuasion. As such, 
they reflect the findings of Allyn and Festinger 
(1961) showing that when a forewarning of attack
was given the source was judged more biased
than when no forewarning was presented. The
present studies, involving independent manipulation
of the source attitude, make this point
more clearly, however. The fact that the AS
condition is a somewhat poorer persuasion reducer
than SA is consistent with the findings of
Greenberg and Tannenbaum (1961) in which less
attitude change in the advocated direction occurred
when the source identification appeared at
the end, rather than at the beginning or in the
middle, of the message.


Refutation Treatment 


The group differences on this strategy are
most clear-cut. Both the immunization (RA) 
and restoration (AR) conditions conferred re- 
sistance, significantly more so in the former. In- 
deed, the RA group showed a slight, albeit not 
significant, strengthening over the O group. 

Further studies in this program of research 
have included five replications of the RA condi- 
tion and three of the AR. In all instances, both 
produced significant degrees of reduction of per- 
suasion. In all three additional cases of com- 
parison, RA was somewhat but not significantly 
superior to AR. 

These findings are largely consistent with those 
of McGuire (1961; McGuire & Papageorgis, 
1961). It is apparent, however, that while the 
inoculation and congruity theories agree that the 
refutation treatment is an appropriate strategy
for reducing persuasion, they postulate somewhat
different mechanisms through which it confers
resistance: the former proposes a defense-arousal
function within the individual, while congruity
theory suggests that it weakens and possibly
reverses the attack message, per se. Detailed
comparison between the two theories on
this point is beyond the scope of the present 
paper, but is an obvious focus for further re- 
search. 


Concept-Boost Treatment 
This treatment, available only in the preattack
(BA) condition, demonstrated a significant degree
of immunization against the attack. Since belief
measures were available at T2 (that is, directly
after the boost message, but before the attack),
it was possible to ascertain whether the boost
actually strengthened the belief, as anticipated.
Actually, only a slight increase was noted,
hardly enough to account for the apparent degree
of immunization.

In an additional study, however, a significant 
(p < .001) strengthening of belief (from 11.91 to 
13.29) was noted after the boost message. This 
replication also had the virtue of larger-sized 
groups (N = 30), and yielded similar findings
(BA = 10.01, A = 8.16, p < .05). McGuire
(1961), too, reports a significant increase in
belief after exposure to the supportive message.
It is also likely that other aspects of belief sys- 
tems—for example, one or more of Guttman’s 
(1954) attitude “components”—may be involved 
in strengthening of belief, making it more re- 
sistant to change. 


Comparison across Conditions 


We have already indicated that comparison
between the suggested strategies is of little or no 
interest to us here. Examination of the results 
does indicate that the refutation and boost treat- 
ments are somewhat superior to the source at- 
tack and denial procedures—suggesting, inci- 
dentally, a more information-based rather than a 
purely psychological (cf. Abelson & Rosenberg, 
1958; Osgood, 1960) mechanism. But any such 
comparison may be contaminated by differences 
other than those of pure strategy—for example, 
messages of different length, different format, 
emanating from somewhat different sources, etc. 
Under such circumstances, it is impossible to
attribute the observed differences to the strategies,
as such.

A direct comparison between the immunization 
and restoration conditions, however, is possible, 
since here, within a given treatment, the various 
messages are identical and only their order of 
exposure was varied. The data in Table 1 indicate 
a significant superiority for the immunization 
condition in the source attack and refutation 
treatments, but a reversal, though not a significant
one, in the denial treatment. These data are
not appropriate for a comparison across the
three treatments combined. However, an analysis
of variance of the unadjusted T3 scores did show
less overall change for the immunization (10.54)
versus restoration (9.64) condition: not quite
the 16:1 ratio implied by “an ounce of prevention
equals a pound of cure,” but significant
nevertheless (F = 4.60, df = 1/143, p < .05).

This may represent a special case of a basic 
primacy versus recency preference (cf. Hovland,
Mandell, Campbell, Brock, Luchins, Cohen, Mc- 
Guire, Janis, Feierabend, & Anderson, 1957). But 
it may also represent something more funda- 
mental for the reduction of persuasion, in that 
an attempt to restore the belief after the attack 
is akin to “locking the barn door after the horse
has been stolen.” If the attack is successful (and
all the evidence suggests that it is), then any
restorative measure creates an incongruous situ- 
ation since it is in conflict with the now-negative 
attitude toward the concept. Some change back 
toward a more positive attitude may result from 
this new incongruity. But as Kiesler and Kiesler 
(1964) have pointed out in comparing a “forewarning”
with an “after-warning,” another way
out of the dilemma is for the subject to reject
the latter (i.e., the restorative treatment) rather
than the attack message. The various immuniza- 
tion conditions, on the other hand, introduce 
little or no additional conflict in themselves. But 
they do serve to blunt the subsequent attack, 
thus reducing the potential incongruity it may 
introduce. 
EFFECTS OF LEADERSHIP STYLE UPON GROUP PERFORMANCE
AS A FUNCTION OF TASK STRUCTURE 1


MARVIN E. SHAW AND J. MICHAEL BLUM 2
Fiedler’s contingency model holds that directive leadership is more effective
when the group-task situation is either highly favorable or highly unfavorable
for the leader, whereas nondirective leadership is more effective in the intermediate
ranges of favorability. An experiment was conducted to test the
generality of this hypothesis. 5-person groups attempted 3 tasks under either
directive or nondirective leadership. Leadership behavior was manipulated by
instructions. The 3 tasks were selected to vary along the solution multiplicity
dimension, hence presumed to reflect different levels of favorability for the
leader. The results indicated that the directive leader was more effective than the
nondirective leader only when the group-task situation was highly favorable
for the leader, thus only partially supporting the hypothesis.


Studies of the effects of leadership style 
(autocratic versus democratic, directive versus 
nondirective, etc.) upon group effectiveness have 
yielded ambiguous and often contradictory re- 
sults (Anderson & Fiedler, 1964; Fiedler, 1958; 
Lewin, Lippitt, & White, 1939; Preston & Heintz, 
1949; Shaw, 1955). A theory proposed recently 
by Fiedler (1964) suggests a possible explanation 
of these inconsistent findings. His basic thesis 
is that the type of leadership behavior required 
for effective group performance is contingent 
upon the favorableness of the group-task situa-
tion for the leader, where favorableness refers
to the degree to which the group environment
makes it easy or difficult for the leader to in-
fluence group members. When the group-task
situation is either highly favorable or highly
unfavorable for the leader, controlling, managing,
directive leadership behavior is most effective,
whereas permissive, considerate, nondirective
leadership is needed for moderately unfavorable
group-task situations.


1 This research was supported by the Office of

According to this theory, the favorableness 
of the group-task situation is determined by three 
dimensions: the affective relation between the 
leader and his members, the power inherent in 
the leadership position, and the degree to which 
the task is structured. Although it is recognized
that the interaction of these dimensions is com-
plicated, Fiedler suggests that the leader’s rela-
tions with his members is the most important,
the structure of the task is next most important,
and inherent power of the leadership position is 
least important for the favorableness continuum. 
Once measures of the three dimensions are avail- 
able, it is possible to order group-task situations 
along the favorableness continuum, by first order- 
ing the group-task situation on the basis of the 
leader’s relation with his group, then on the basis 
of task structure, and lastly on the basis of 
position power. This ordering may be considered
to be an operational definition of the favorability
continuum. The most favorable group-task situa-
tion is one in which leader-member relations
are good, the task is highly structured, and the
position power is strong; the most unfavorable
group-task situation is one in which leader-
member relations are very poor, the task is
unstructured, and the position power is weak. 
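
The ordering procedure described above amounts to a lexicographic sort: situations are compared first on leader-member relations, then on task structure, and only then on position power. The following is a minimal sketch of that idea, not Fiedler's actual scoring. The `Situation` class, the 0/1 codings, and the example situations are all hypothetical, and the sketch assumes that good relations, a structured task, and strong position power each count as more favorable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    """Hypothetical coding of a group-task situation (higher = more favorable)."""
    name: str
    relations: int  # leader-member relations: 1 = good, 0 = poor
    structure: int  # task structure: 1 = structured, 0 = unstructured
    power: int      # position power: 1 = strong, 0 = weak


def favorability_key(s):
    # Tuple comparison is lexicographic, so relations dominate structure,
    # and structure dominates position power.
    return (s.relations, s.structure, s.power)


def order_by_favorability(situations):
    """Return situations ordered from most to least favorable."""
    return sorted(situations, key=favorability_key, reverse=True)


situations = [
    Situation("good/unstructured/strong", 1, 0, 1),
    Situation("poor/unstructured/weak", 0, 0, 0),
    Situation("good/structured/strong", 1, 1, 1),
    Situation("good/structured/weak", 1, 1, 0),
    Situation("poor/structured/strong", 0, 1, 1),
]
ordered = order_by_favorability(situations)
for s in ordered:
    print(s.name)
```

Because the comparison is strictly hierarchical, a situation with good relations always ranks above one with poor relations, whatever the task structure or position power.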

In support of his theory, Fiedler reexamined
the findings of several studies of leadership, in-
volving 21 different kinds of groups. In these
studies leader-member relations had been mea-
sured by a variety of self-report procedures
(sociometric choice, leader ratings of group 
atmosphere, etc.). In Fiedler's reexamination, 
task structure was operationally defined in terms 
of four task dimensions (decision verifiability, 
goal clarity, goal path multiplicity, and solution 
specificity) suggested by Shaw (1963), and in- 
herent power of the leadership position was 
tentatively defined by a check-list rating of the 
leader’s position. Using these measures, a favor-
ability continuum was determined by the order-
ing procedure described above. Median correla-
tions between least preferred co-worker (LPC)
scores and group effectiveness scores were then
related to this favorableness continuum. This
yielded a U-shaped curve showing that leaders
with low LPC scores (directive leaders) were
more effective at both extremes of the con-
tinuum, whereas leaders with high LPC scores
(nondirective leaders) were more effective in the
middle range of the favorableness continuum.

This support for the theory is impressive, but
is open to the objection that since the interpre-
tation is ex post facto it may not stand up under
cross-validation. Furthermore, leadership behav-
ior was inferred from personality scores, a
hazardous procedure at best. The purpose of the
present study is to test the generality of Fiedler’s
theory by experimentally manipulating both the
group-task favorability dimension and the be-
havior of the leader.


METHOD 


The experimental design was a mixed factorial,
involving two styles of leadership (directive and
nondirective), and three degrees of group-task
favorability (high, moderate, and low). 


Subjects 


The subjects for this experiment were 90 male 
undergraduates at the University of Florida. They 
were randomly assigned to 18 five-member groups. 
Nine groups worked under directive leadership and 
9 under nondirective leadership. 


Tasks 


Each group attempted the same three tasks, which 
were chosen to vary the favorability dimension, as 
described below. Task A was a discussion task 
adapted from Cleveland and Fisher (1957). It re-
quired the group to list the five most important
traits that a person needs for success in our culture.
Task B was a discussion task taken from a report
by Bass (1960) involving the case of a young 
politician who is burdened with an alcoholic wife. 
The group was asked to decide which of five pos- 
sible courses of action would be the best one. Task C 
called for the group to identify five objects 
(“wrench,” “ruby,” etc.) by asking questions (Smith, 
1957). The subjects were given a clue that the object 
was either vegetable, animal, or mineral, and then 
permitted 40 questions to identify it. These tasks
have been described fully in an earlier report (Shaw,
1963).3


Experimental Manipulations 


Leadership style. Type of leadership behavior was 
manipulated by instructions to the assigned leader. 
One member of each group was briefed in private by 
the experimenter’s assistant. He was then directed to 
report to the experimental room in the same manner 
as the other subjects. Half of the leaders were 
instructed to behave in a controlling, directive man- 
ner and the other half were instructed to be permis- 
sive and nondirective in their behavior toward other 
group members. 

Group-task favorability. As Fiedler (1964) noted, 
it is difficult to obtain a laboratory situation in 
which affective leader-member relations are poor, 
and when a leader is assigned to the group his power 
position is strong. Hence, the group-task favorabil- 
ity variable was manipulated by means of task 
structure only. The three tasks were selected to vary 
on the solution multiplicity dimension, based upon 
the scale values determined by Shaw (1963); scale 
values on other task dimensions were essentially
the same.


3 Copies of tasks, general instructions, and special 
instructions to leaders have been deposited with the 
American Documentation Institute. Order Document 
No. 8605 from ADI Auxiliary Publications Project, 
Photoduplication Service, Library of Congress, 
Washington, D. C. 20540. Remit in advance $1.25 
for microfilm or $1.25 for photocopies and make 
checks payable to: Chief, Photoduplication Service, 
Library of Congress.




From this manipulation, it is clear that the group- 
task situation was most favorable for Task C 
(leader-member relations good, position power 
strong, task structured), most unfavorable for Task 
A (leader-member relations good, position power 
strong, task unstructured), and intermediate for 
Task B (leader-member relations good, position 
power strong, task moderately structured). Thus, 
directive leadership should be more effective for 
Tasks A and C, and nondirective leadership for 
Task B. 


Procedure 


The five subjects in each group were seated around 
an oval work table. Each person was given a copy 
of the instructions so he could follow the experi- 
menter’s verbal presentation. The instructions indi- 
cated that the group would be asked to solve a 
number of tasks working together as a group, and 
that Person A (the instructed subject) was appointed 
leader to facilitate interaction. The instructions fur- 
ther stated that the group must follow the leader's 
directions. 

The order in which the three tasks were presented 
to the groups varied according to a systematic Latin 
square, such that each task was attempted first, 
second, and third an equal number of times. After 
each task was explained to the group, 5 minutes were 
given for planning the method of attack. Records 
were kept of time required for completion and of 
final solutions for each of the tasks. After all three
tasks had been completed, each subject responded 
to a questionnaire which called for ratings of satis- 
faction with the group, group cooperation, group 
performance, leadership performance, and directive- 
ness of the leader’s behavior. 
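
The counterbalancing of task order can be illustrated with a small sketch. The report does not say which Latin square was used, so the cyclic square below is an assumption, as are the function names; the sketch also assumes that the nine groups within one leadership condition cycle through the three rows.

```python
from collections import Counter


def cyclic_latin_square(items):
    """Build an n x n Latin square by rotating the item list one step per row."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_orders(n_groups, tasks=("A", "B", "C")):
    """Give each group one row of the square, cycling through the rows."""
    square = cyclic_latin_square(list(tasks))
    return [square[g % len(square)] for g in range(n_groups)]


# Nine groups per leadership condition, as in the design described above.
orders = assign_orders(9)
for position in range(3):
    counts = Counter(order[position] for order in orders)
    print(position + 1, dict(counts))
```

With nine groups and three rows, each task is attempted first, second, and third by exactly three groups, which is the balance the procedure calls for.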


RESULTS 


Before examining the results, it is necessary 
to inquire whether the leader’s behavior was ma- 
nipulated in the intended direction. Evidence on 
this was obtained from the ratings of the leader. 
Ratings on a 5-point scale ranging from “very 
directive” to “very nondirective” were asked for
in response to the question, “How would you
classify your leader?” Rating scores could range
from 1 for a “very nondirective” response to
5 for a “very directive” response. The mean
rating given by followers of leaders instructed 
to behave in a directive manner was 4.12 as 
compared with 3.52 for followers of nondirective 
leaders (p < .02). The manipulation thus appears 
to have produced the intended differences, al- 
though both types of leaders apparently were 
seen as more directive than nondirective. 

The time scores are the only measures of group 
performance that are comparable across tasks. 
These scores were approximately normal in dis- 
tribution, but the means and standard deviations 
tended to be correlated. Therefore, raw scores
were transformed by the square-root transforma-
tion prior to analysis. Table 1 gives the means
of raw scores and of transformed scores for
leadership and task conditions.


TABLE 1

MEAN TIME SCORES (MINUTES) FOR LEADERSHIP
AND TASK CONDITIONS

                            Task
Leadership style      A        B        C
Directive           23.42    13.36    24.67
                    (4.80)   (3.71)   (4.97)
Nondirective        16.76     5.29    34.73
                    (4.03)   (2.36)   (5.85)
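
The reason for a square-root transformation when means and standard deviations rise together can be seen in a short sketch. The two sets of completion times below are invented for illustration and are not data from this experiment; the function name is likewise hypothetical.

```python
import math
from statistics import mean, stdev


def sqrt_transform(scores):
    """Apply the square-root transformation to each raw time score."""
    return [math.sqrt(x) for x in scores]


# Hypothetical completion times in minutes for a quick and a slow condition.
fast = [4.0, 9.0, 4.0, 9.0]
slow = [25.0, 36.0, 49.0, 36.0]

raw_sd_ratio = stdev(slow) / stdev(fast)
transformed_sd_ratio = stdev(sqrt_transform(slow)) / stdev(sqrt_transform(fast))

print("raw means:", mean(fast), mean(slow))
print("SD ratio before transformation:", round(raw_sd_ratio, 2))
print("SD ratio after transformation:", round(transformed_sd_ratio, 2))
```

In this made-up case the slow condition has a much larger spread than the fast one on the raw scale, and the gap narrows after the transformation, which brings the scores closer to the equal-variance assumption of the analysis of variance.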

Analysis of variance yielded significant F
values for tasks (F = 31.96, df = 2/32, p < .01)
and for the Leadership Style X Tasks interaction
(F = 7.54, df = 2/32, p < .01). For purposes
of this study, the significant interaction term
is of greatest interest. Inspection of the data in
Table 1 reveals that directive leadership was
more effective than nondirective leadership only
on Task C; on both Tasks A and B nondirective
leadership was more effective. Separate t tests
revealed significant differences between leader-
ship styles on Tasks B and C (p < .01 in each
case), but the difference on Task A was not sig-
nificant (p < .20). This finding only partially
supports the prediction that directive leadership
should be more effective on both Task A and
Task C.

Questionnaire data are given in Table 2,
broken down for leaders and followers taken
separately. Although ratings of satisfaction, co-
operation, group performance, and leader per-
formance were all higher for nondirective than
for directive leadership, differences were signifi-
cant only for ratings of cooperation (p < .09).
None of the differences between leaders and
followers was significant, except for ratings of
leader directiveness. Leaders rated themselves
less directive than did their followers (p < .02).


TABLE 2
MEAN RATINGS OF LEADERSHIP DIRECTIVENESS,
SATISFACTION, COOPERATION, GROUP PERFOR-
MANCE, AND LEADER PERFORMANCE

                        Directive leader     Nondirective leader
                        Leader   Follower    Leader   Follower
Leader directiveness     3.78      4.12       2.94
Satisfaction             4.28      4.01       4.34
Cooperation              3.83      3.87       4.50
Group performance        4.17      3.84       4.33
Leader performance       2.84      3.20       3.67


DISCUSSION


The most useful results for evaluating Fied-
ler’s contingency model are those obtained from
the time scores. Under the conditions established
in this experiment, it is clear that directive
leadership was more effective than nondirective 
leadership only when the group-task situation was 
highly favorable for the leader. Fiedler's model 
predicts that the directive leader should also have 
been more effective when the group-task situa- 
tion was highly unfavorable, and to this extent 
the present findings appear to be at variance with 
the model. Obviously, this discrepancy may be
due either to a failure to manipulate the group- 
task dimension effectively or to inadequacies of 
the model. 

The most likely explanation is that the highly 
unfavorable situation (Task A) is, at most, only 
moderately unfavorable. Favorability was manip- 
ulated only by task structure as reflected by the 
solution multiplicity task dimension. The task is 
thus unstructured in the sense that there are 
many possible solutions; however, the other 
variables postulated as determinants of favor- 
ability supposedly contributed to high favorabil- 
ity. If this reasoning is correct, the conditions in 
this experiment actually varied only from highly 
favorable to moderately unfavorable, and the 
findings agree with theoretical expectations. 

Despite the apparent correctness of this in- 
terpretation, there is some reason to believe that 
the theory can be improved. Fiedler noted that 
the model is an oversimplification, and pointed 
to the difficulties in weighting the dimensions re- 
flected by the favorability continuum. It is the 
view of the writers that the oversimplification is 
most evident with respect to the task structure 
dimension. Differences among tasks that are rele- 
vant to leadership requirements are almost cer- 
tainly not limited to differences in task struc- 
ture. For example, variations in difficulty and 
cooperation requirements probably call for dif-
ferent leader behaviors. Furthermore, it seems
unlikely that the task structure dimension is un- 
ambiguously related to the favorability contin- 
uum. An unstructured task may in some instances 
be favorable for the leader in the sense that it 
encourages the members to accept his leadership. 
On the other hand, a highly structured task may 
be unfavorable to the leader in the sense that his 
leadership is superfluous. In this case, directive 
leadership might be superior to nondirective lead-
ership simply because his attempted directions are
ignored, whereas the attempted permissiveness 
(encouraging contributions for all, etc.) of the 
nondirective leader would only distract the group 
from its goal. This difference in leadership ef- 
fectiveness would be unrelated to favorability of 
the group-task situation for the leader, as opera- 
tionally defined. 

In short, it is suggested that task structure is 
an important variable in the determination of 
leadership effectiveness, but that this variable is 
ambiguously related to the favorability continuum 
postulated by Fiedler. A more complete analysis 
requires a consideration of the particular task 
requirements in relation to leadership type. Fied- 
ler's (1964) discussion of coacting versus inter- 
acting groups is a step in this direction, but 
greater consideration of task dimensions is called 
for. 

Regardless of their implications for the con- 
tingency model, the results of this experiment 
show clearly that directive leadership is more 
effective than nondirective when the task is 
highly structured; that is, when there is only 
one solution and one way (or only a few ways) of 
obtaining this solution. The requirements for 
leadership are quite limited, and nondirective 
leader behaviors may only interfere with the 
problem-solving process. However, on tasks that 
require varied information and approaches, non- 
directive leadership is clearly more effective. On 
such tasks the requirements for leadership are 
great. Contributions from all members must be 
encouraged, and this requires motivating, advis- 
ing, rewarding, giving support—in short, nondi- 
rective leadership. 


1st-born and later-born adolescents were tested in conditions resembling some
of those used by Deutsch and Gerard. In the condition most closely resembling 
the Asch situation the 1st-born Ss yielded more than the later born. In a group- 
reward condition the 1st-born Ss showed a significant increase in conforming 
errors while the later-born Ss were relatively unaffected. In a memory condition 
the later-born Ss yielded more than in either of the other 2 conditions while 
the 1st-born Ss were relatively unaffected. These findings were interpreted as
confirming the hypotheses that 1st-born persons are more responsive to 
normative influences while later-born persons will be more affected by in-
formational influences.


In a recent experiment (Becker, Lerner, & 
Carroll, 1964) first-born and later-born adoles- 
cents were tested in an Asch (1956) situation. 
By introducing the anticipation of a small or 
large “payoff” for each correct judgment the 
amount of yielding was significantly affected. A 
small payoff greatly decreased conforming errors 
in the first-born group, and somewhat decreased 
conforming errors in later-born subjects. A large 
payoff, however, led to increased yielding only 
for the later-born subjects. These findings were 
interpreted as confirming the hypothesis that 
first-born persons are more dependent on others 
for social support while later-born persons rely 
more on others for validation of their beliefs. 

Confidence in the validity and generality of 
this interpretation is somewhat limited because 
of two aspects in the experimental situation. The 
first is based on the characteristics of the sub-
jects, all of whom were golf caddies from the
same country club. It is probably safe to sug-
gest that they were from the lower middle class
and were somewhat special within that class in
terms of the kind of employment they sought
and achieved.

The second, more serious problem derives from
the line of reasoning concerning the effect of a
small versus a large payoff as a technique for
manipulating the normative and informational in-
fluence (Deutsch & Gerard, 1955) operating in
the Asch situation. A small payoff was assumed
to provide motivation to resist the normative in-
fluence in the situation and a large payoff was
supposed to enhance the informational value of
the unanimous majority’s judgments. Though
perhaps it was a logical manipulation, there was [...]

The study reported here was designed to vali-
date the original hypotheses by testing them
under different experimental conditions. It differs
from the initial study in several important re- 
spects. First, the subjects were all students in a 
high school located in an upper income com- 
munity. Second, the techniques for manipulating 
normative and informational influences were 
changed. Instead of a control and two payoff 
conditions the design included a control condi- 
tion, a "memory" condition which was expected 
to increase the informational influence, and a 
"group-reward" condition which was intended to 
heighten normative influence. The specific tech- 
niques employed to create the “memory” and 
"group" conditions were similar to those used by 
Deutsch and Gerard when they initially made the 
distinction between these two sources of influ- 
ence. Third, in the initial study a small payoff 
reduced normative influence while a large payoff 
increased informational influence. In this study 
the selected manipulations operated so as to 
increase both normative and informational 
influences.

Despite the contrasting experimental condi- 
tions, we expected to confirm the original hy- 
potheses that first-born persons will appear more 
dependent on other people when normative in- 
fluence is operating in a situation, whereas later- 
born persons will appear to be more dependent 
to the extent that informational influence is 
present in the situation.


PROCEDURE 


Subjects were 48 male volunteers from a large 
high school located in an upper-middle-class suburb 
north of Chicago. Twenty-three of the subjects 
were the first-born or only children in their families 
and 25 were later-born children. Subjects were
recruited and scheduled for the experiment during 
their study (or free) periods. 

As each subject appeared he was asked to identify 
himself and then he was informally questioned as 
to the number and age of his siblings, if any. The 
subjects were assigned randomly to a control group,
a group-reward group, and a memory group with 
the restriction that half the subjects run under each 
condition be the first- or only-born child in their 
family, while the remaining subjects in the three 
groups each had at least one older sibling. (One error 
in the assignment was made by the experimenter so 
that in the memory condition the first-born group 
included seven rather than eight subjects and the 
later-born group contained nine subjects.) Accom- 
plices were recruited from the same pool of volun- 
teers from which the subjects were selected. The 
school was large enough, with enough simultaneous
study periods, so that in no case was the naive
subject acquainted with any of the accomplices.

The control condition was an exact replication of 
the control condition used in our earlier study and 
also of Deutsch and Gerard’s face-to-face (visual
series) situation. That is, three accomplices were
employed with each naive subject, all of whom were
required to make 18 judgments of which 12 were
“critical trials,” that is, those where the accomplices
unanimously gave incorrect answers. The subjects
were instructed to match accurately the length of 
the standard line with one of three companion lines. 

The memory condition was like the control condi- 
tion except that the lines were removed before any 
of the choices were announced. Just as Deutsch and 
Gerard did in their memory series, we allowed 
approximately 3 seconds to elapse before asking for 
the first judgment. 

The group-reward condition was similar to 
Deutsch and Gerard’s group situation. The subjects 
were told that several groups were taking part in 
the experiment and that the group which made the 
fewest number of total errors would be rewarded 
with tickets to a Chicago Black Hawk hockey game. 
The judgments were made with the lines physically
present.

These procedures allowed us to make the following 
predictions. In the control condition we expected to 
replicate the finding of Becker and Carroll (1962) 
and Becker et al. (1964) that first borns yield more 
than later borns in an Asch situation. Compared 
to control-group subjects we predicted that first-born 
subjects under group-reward conditions would 
exhibit a significant increase in yielding while later- 
born subjects under group-reward conditions would 
be relatively unchanged. We also predicted that later- 
born subjects would exhibit a significant increase in 
yielding in the memory condition compared with 
the control condition while the first borns would 
remain relatively unaffected. 


RESULTS 

From an inspection of Table 1 it can be seen 
that the predictions were upheld by the data.
In the control condition the first-born subjects
yielded more than the later borns (p = .058).
However, when the subjects were led to believe 
that their responses would affect the likelihood 
of their fellow respondents winning a prize 
(group-reward condition) the first-born subjects 
showed a significantly greater degree of yielding 
than in the control condition (p= .058). The 
later-born subjects exhibited no comparable in-
crease in yielding behavior (p > .41). The dif-
ference between the yielding of the first borns
in the control versus the group-reward condition
was not significantly greater than the differences
in yielding for the later-born subjects (t = .90,
df = 28). 

In the memory condition the pattern of yield- 
ing was different than in either the control or 
group-reward conditions. The later-born subjects 
made significantly more errors in the memory 
condition than in the control condition (p = .05)
while the first-born subjects made fewer errors
than in the control condition (p = .05) or in the
group-reward condition (p = .03). All p values
were based on Mann-Whitney U tests (Siegel,
1956). An additional test was performed to
determine if the difference in errors made by
the later borns in the control versus the memory
condition was greater than the differences exhib-
ited between the first-born subjects in these
conditions. This comparison yielded a t = 2.365,
df = 28, p < .05.


TABLE 1
BIRTH ORDER AND CONFORMITY UNDER
VARYING CONDITIONS

                    Number of errors
Condition        First born    Later born
Control
Group reward
Memory

[Individual error scores illegible in the source.]
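
For readers unfamiliar with the Mann-Whitney U test used for the p values above, the statistic itself is easy to compute by hand. The sketch below is illustrative only: the two lists of error counts are invented, not the scores from this study, and the function name is an assumption. It counts, for every pair of scores drawn one from each group, which group's score is larger, with ties counted as one half.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y): the number of (x, y) pairs won by each sample.

    A pair is won by x when the x score exceeds the y score;
    a tied pair contributes one half to each sample.
    """
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical conforming-error counts for eight subjects per group.
first_born = [0, 2, 3, 5, 7, 8, 9, 12]
later_born = [0, 0, 1, 1, 2, 3, 4, 6]

u_first, u_later = mann_whitney_u(first_born, later_born)
u = min(u_first, u_later)
print(u_first, u_later, u)
```

The smaller of the two values is the one compared against a table of critical values for the given sample sizes; a library routine such as `scipy.stats.mannwhitneyu` can supply the p value directly.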


DISCUSSION 


The present study differed from the initial one 
in certain essential respects. The subjects, al-
though approximately the same age as the golf
caddies, were from families living in an extremely
different social environment—a suburban upper- 
middle-class community versus an urban lower- 
class neighborhood. The experimental procedures 
used to manipulate the normative and informa- 
tional influences were also relatively distinct in 
the two studies. In the first study normative 
influence was reduced by a small payoff and 
informational influence enhanced by a large pay-
off. In this study normative influence was in-
creased by a group reward and informational
influence increased by compelling the subjects 
to respond from their memory of the stimuli. 


Given these differences between the two experi-
ments the findings can be considered remarkably
similar. Later-born subjects were most influenced 
when they had reason to believe that the other 
people in the situation were providing them with 
valid information about the common “reality” 
which confronted them. The first-born subjects 
remained relatively unaffected by the informa-
tional value of the responses of their peers. Their
yielding behavior was significantly altered by the 
motivation induced in them to go along with or 
resist the expectations of the unanimous major- 
ity. In the initial experiment the first-born sub- 
jects were induced to resist going along with the 
majority by offering them a reward for a correct 
answer. In this second experiment they were led 
to greater conforming behavior by increasing the 
apparent need of the majority to have the 
subject go along with their choice. 

Although most of the predictions in this study 
were confirmed at marginal levels of significance, 
when the findings are considered in conjunction 
with those of Becker et al. (1964) they present 
a rather clear picture. Either the first- or the 
later-born person will appear more or less de- 
pendent upon other people as a function of the 
type of influence operating in the situation. To 
the extent that normative influence is present 
the first born will appear more dependent. If 
informational influence predominates the later-
born person will be more affected.


AROUSAL OF NEED FOR AFFILIATION IN WOMEN


HOWARD M. ROSENFELD AND SAMUEL S. FRANKLIN 1

100 freshman college women from a residence hall were randomly assigned
to 1 of 3 conditions expected to affect need for affiliation (n Affiliation) or to 
a control group. In agreement with previous research on male Ss, 2 of the 
conditions—being rated by peers and being rejected by peers—resulted in 
arousal of TAT n Affiliation. No effect of the 3rd condition, social acceptance,
was obtained. Implications for a 2-factor theory of n Affiliation were discussed. 


Although the fantasy measure of need for 
affiliation (n Affiliation) described by Heyns, 
Veroff, and Atkinson (1958) has been widely 
employed in research, some basic assumptions 
concerning its validity have not been put to ade- 
quate empirical test. The present study attempts 
to determine the applicability of the measure to 
female subjects and to evaluate some theoretical 
inferences that were drawn in initial validation 
experiments.

According to McClelland (1958, p. 9), a mini- 
mal requirement for the validation of a measure 
of motivation is that it reflect the presence or
absence of the motive in the individual. Experi-
mental arousal commonly has been used as a
direct method of determining whether fantasy
measures of motivation are sensitive to gross
variations in motive strength. Although experi-
mental arousal of n Affiliation has been demon-
strated repeatedly in male samples (Atkinson,
Heyns, & Veroff, 1954; French & Chadwick,
1956; Shipley & Veroff, 1952) there are no pub-
lished reports of attempts to arouse n Affiliation
in women. Yet female subjects often have been
used in correlational studies employing the
projective measure of n Affiliation. One aim of
the present study is to determine whether n
Affiliation can be aroused in women by the
techniques that have proved successful in pre-
vious studies of men.

The present study also is concerned with the 
degree of support provided by the earlier experi-
mental studies for a two-factor theory of n
Affiliation. Following McClelland’s (1951) gen-
eral distinction between approach and avoidance
motives, Shipley and Veroff (1952) suggested
that the tendency to seek affiliation (n Affiliation)
is a function of two subdispositions: the ap-
proach motive (pleasure of acceptance) and the
avoidance motive (pain of rejection). Previous at-
tempts to arouse n Affiliation have employed the
method of having members of informal social
groups rate each other in terms of desirability. It
is not clear, however, whether the successful
arousal of projective n Affiliation in these studies
was due to fear of rejection (avoidance motive)
as proposed by Shipley and Veroff, both fear of
rejection and hope of acceptance (approach mo-
tive) as proposed by Atkinson et al. (1954), or
some other induced state such as curiosity about
the outcome of the sociometric rating procedures.2


1 The authors appreciate the assistance they re-
ceived in conducting various phases of the study
from Emily Taylor, Marvin McKnight, Sandra Coty,
and Sandra Simmons. The study was supported by
the Bureau of Child Research at the University of
Kansas.

A more direct criterion of the avoidance mo- 
tive—rejection—was employed in a second valida- 
tion study by Shipley and Veroff (1952), Al- 
though they confirmed the prediction that stu- 
dents who were recently rejected in fraternity 
“rush” would have higher n Affiliation than 
those who were accepted, the results of the study 
are difficult to interpret for two reasons, First, 
since the two groups of subjects were not equated 
for n Affiliation prior to fraternity rush, rejected 
subjects may have been higher in n Affiliation to 
begin with. The subsequent finding that n Affilia- 
tion is negatively correlated with popularity (At- 
kinson et al, 1954; Shipley & Veroff, 1952) at- 
tests to this possibility. Second, the accepted 
subjects appear to have received the "pleasant 
stimulus reward value of the affiliative relation- 
ship" that should arouse the approach disposition 
(Shipley & Veroff, 1952, p. 354). Thus, to provide 
a more direct test of the approach and avoidance 
interpretations of n Affiliation, the present ex- 
periment employs acceptance and rejection as 
independent arousal conditions.3


2 Attempts have been made to distinguish between 
the approach and avoidance motives in terms of the 
manifest content of affiliative imagery (Byrne, Mc- 
Donald, & Mikawa, 1963; French & Chadwick, 1956), 
but the validity of this distinction has not been 
directly verified by experimental methods. 

3 French and Chadwick (1956) attempted to
arouse their own projective measure of n Affiliation 
METHOD


Subjects


Residents from three living units of a freshman 
women's hall at the University of Kansas were 
selected as subjects. One hundred of the 117 women 
selected actually participated in the experiment. Since 
nearly all freshman women are assigned to residence 
halls on a random basis, it was assumed that the 
sample was representative of the female residence 
hall population. In many respects the living units 
resemble cohesive sorority groups. Intimate social 
bonds between women in a unit are facilitated by 
proximal living quarters, a common dining area, and 
many shared activities. Thus, they may be viewed 
as comparable in social atmosphere to the fraternity 
groups used in the experiments by Shipley and 
Veroff (1952) and Atkinson et al. (1954). 


Experimental Setting and Treatments 


The experiment was performed in three sessions 
of approximately 1 hour duration, one session for 
each living unit. All three sessions were identical. 
Two treatment rooms were used. Each room con- 
tained a loudspeaker and a number of three-sided 
individual cubicles equipped with desks and ear- 
phones. 

Subjects were randomly assigned to experimental
conditions. The conditions were: C, a control group
in which no attempt was made to arouse n Affiliation;
So, a sociometrically aroused group in which
subjects were administered a sociometric test but
received no subsequent feedback about how well
they were liked; S−, a rejection feedback group in
which subjects were informed that they received
predominantly negative ratings following administration
of the sociometric test; S+, an acceptance
feedback group in which subjects were informed
that they received predominantly positive ratings
following administration of the sociometric test.

The C and So conditions took place in one room 
and the S— and S+ conditions in the other room. 

Following preliminary instructions which requested 
the cooperation of subjects, a sociometric question- 
naire was distributed to Groups So, S—, and S+. 
The questionnaire instructed subjects to check the 
term “like,” “dislike,” or “undecided” after the name 
of each of their companions. At the same time, 
subjects in Group C were given a “neutral” task. 
The task consisted of a printed page of geometric 
forms and printed instructions requesting the sub- 
jects to add lines to the figures to make them 
personally more pleasing to look at. 

Immediately upon completion of the sociometric 
questionnaire or neutral task, each subject listened 
to background music through earphones for 10 
minutes. The listening interval was employed to 
make the subsequent feedback to Groups S+ and 
S— appear authentic; that is, the listening interval 


through sociometric acceptance and rejection, but 
did not report the outcome of the procedures be- 
cause it was apparent to them that the subjects 
did not believe the experimental manipulations. 


BRIEF ARTICLES 


ostensibly gave the experimenters time to score the
sociometric questionnaires.

At the end of the 10-minute period, subjects in 
the two feedback conditions (S− and S+) were
notified via earphones that the ratings had been
tabulated and that they would receive information
regarding their general standing in the group. Prior 
to these instructions subjects were not made aware 
that feedback was to be given. Each subject in the 
S+ and S— groups then received a card upon which 
appeared her name and the handwritten word “liked” 
or “disliked.” 


Assessment of Responses 


Immediately after the sociometric feedback cards
were distributed, a group TAT test of n Affiliation
was administered to all subjects. Standard instructions
(Atkinson, 1958, Appendix III), prerecorded
on tape by a male voice, were delivered simultaneously
through loudspeakers to both experimental
rooms. Subjects were asked to turn their chairs so
as to face a large screen at the end of their room.
Slide projectors in each room simultaneously presented
a series of six TAT pictures.4

A postexperimental questionnaire, designed to pro- 
vide subjective validation of experimental opera- 
tions, was administered before the actual nature of 
the study was revealed to subjects. 


RESULTS 


Mean n Affiliation scores of the control and 
arousal groups are reported in Table 1. N Affilia- 
tion was significantly higher (p < .05) in the So 
and S— groups than in the C group. The mean n 
Affiliation score of the S+ group did not differ 
significantly from that of the C group. 

While the primary interest of the present study 
was in the effects of the experimental treatments 
on n Affiliation as it is operationally defined by 
Heyns et al. (1958), several more limited sub- 
scores derivable from responses to the TAT were 
analyzed separately. One of these was the cate- 
gory "affiliation imagery," which is given dis- 
proportionate weighting since it must be fulfilled 


4 The authors are indebted to Joseph Veroff for
providing the TAT pictures which were previously 
included in a national survey (Veroff, Atkinson, Feld, 
& Gurin, 1960) and a study of achievement motiva- 
tion in women (Lesser, Krawitz, & Packard, 1963). 
The six pictures, in order of presentation, are as 
follows: two women in a laboratory—one working 
with test tubes; woman seated by young girl who 
is reclining in chair; women in group—three seated 
facing each other, one standing in background; 
woman applying a cover to a chair; two women in
a kitchen, preparing food; woman standing, with
man standing behind and to the side. 

The stories were scored by a rater who had pre- 
viously established high reliability on practice stories 
provided in Atkinson (1958). 
TABLE 1
MEAN n AFFILIATION SCORES


Group                            N(a)   n Affiliation
C (control)                       26       5.04
So (sociometric, no feedback)     26       6.85*
S− (sociometric, rejection)       20       7.35*
S+ (sociometric, acceptance)      22       6.18


a Three subjects in the S+ and three in the S− conditions
expressed disbelief in the veracity of the feedback and were
excluded from the analysis.
* Difference from control group, in direction of hypotheses,
significant beyond .05 level by t test.


before further categories can contribute to the
total n Affiliation score. A second subscore was
composed of the remaining categories scored by
Heyns et al. The third subscore consisted in those
categories which were excluded from the n Affiliation
scoring system by Heyns et al. because of
their inability to differentiate arousal and control
conditions in the study by Atkinson et al. (1954).
This last set of categories, presently referred to
as "negative n Affiliation," consisted in negative
instrumental activity, negative goal anticipation,
negative affective state, and personal obstacle.

When the single category "affiliation imagery"
was tested as the dependent variable, the results
were similar to those reported above for n Affiliation
but slightly more significant (p < .02 for
the So and S− groups, p > .20 for the S+
group).5 The remaining set of categories contributing
to the n Affiliation score also reflected
arousal in the So (p < .06) and S− (p < .025)
groups, but not the S+ group (p > .20). Finally,
none of the experimental groups differed significantly
from the C group in negative n Affiliation
scores.


DISCUSSION


The results of this study indicate that the projective
measure of n Affiliation is applicable to
women. As in earlier studies of men (Atkinson et
al., 1954; Shipley & Veroff, 1952) subjects who
made sociometric ratings of each other without
feedback about the outcome (So group) subsequently
scored higher in n Affiliation than did
subjects who performed a neutral task (C group).
Thus the inclusion of female subjects in studies
using this measure of n Affiliation, and the


5 The finding that the single scoring category
"affiliation imagery" was as sensitive to differences
in experimental conditions as was the total n Affiliation
score is in agreement with the results of
previous arousal studies (Atkinson et al., 1954;
Shipley & Veroff, 1952). The consistency of this finding
suggested that for purposes of assessing general
need for affiliation, it is sufficient to score "affiliation
imagery" alone.
generalization of results from previous male studies
to women, would appear to be justified. The
present study may also be viewed as a cross- 
validation of the n Affiliation scoring system 
(Heyns et al., 1958) that was derived from the 
results of the arousal study of Atkinson et al. 
(1954). 

The finding that rejected subjects (S— group) 
were significantly higher in n Affiliation than 
control subjects supports the conception of n 
Affiliation as an avoidance disposition. Contrary 
to what one would expect on the basis of face 
validity it was found in the present study that 
rejection increased the “positive” but not the 
“negative” categories of affiliative fantasy. This 
outcome is consistent with the theory that both 
the approach and avoidance dispositions of n 
Affiliation lead to a search for positive affective 
relationships (Atkinson et al., 1954; Shipley & 
Veroff, 1952). However, it warns against literal 
inferences of differential motivating states on the 
basis of manifest content of imagery. 

Since socially accepted subjects (S+ group) 
produced n Affiliation scores that were not sig- 
nificantly higher than those of the control group, 
this study does not support the approach theory 
of n Affiliation. However, social acceptance could 
have performed at least three functions—relief 
of existing avoidance motivation, satisfaction of 
existing approach motivation, and stimulation of 
a renewed level of approach motivation (Mc- 
Clelland, 1951). Although the majority of sub- 
jects in the acceptance condition clearly indicated 
that they were pleased with the outcome, ac- 
ceptance may have satisfied existing approach 
motivation but failed to induce further anticipa- 
tions of pleasurable social interaction. Another 
possibility is that acceptance reduced not only 
the level of n Affiliation aroused by the preced- 
ing sociometric rating procedures, but also the 
level of n Affiliation that existed in subjects 
when they first entered the experimental setting. 
If so, acceptance might have reduced more n 
Affiliation than it aroused. For the approach in- 
terpretation of n Affiliation to be upheld, it 
remains to be demonstrated that heightened 
anticipations of social acceptance can arouse n 
Affiliation under conditions in which avoidance 
motivation is controlled. 
RECOGNITION THRESHOLDS FOR VALUE-RELATED WORDS:
DIFFERENCES BETWEEN INNER-DIRECTED AND OTHER-
DIRECTED SUBJECTS
Value-related words were presented tachistoscopically to 24 Ss selected on the
basis of extremity of score on the Bell Questionnaire for inner- and other-
directedness. Among inner-directed Ss there was a significant inverse relationship
between the strength of preference for a given value category and the
time required to recognize words related to that category. The degree of
association between these 2 variables was significantly greater for inner- than
for other-directed Ss.


Riesman’s (1950, 1952) interpretation of con- 
temporary American culture rests principally 
upon a distinction between two character types: 
“inner-directed” and “other-directed.” The inner- 
directed man is described by Riesman as guided 
from within by a gyroscope of internalized goals 
and values which are implanted in childhood and 
which keep him on course in spite of environ- 
mental pressures to deviate. In contrast, the 
other-directed person is described as guided 
chiefly by his sensitivity to the expectations of 
his peers; his source of values is character- 
istically the peer group. This difference in social 
character type might be expected to involve a 
difference in perceptual sensitivity to certain 
classes of stimulus objects—a difference which 
depends on the extent to which the objects are 
sought as sources of need satisfaction and as 
guides to appropriate conduct. Among the stimu- 
lus objects for which a difference in sensitivity 
ought to exist are value-related words. 

Postman and Schneider (1951) found that 
the higher the preference of an observer for a
value category, the lower will be his recognition
thresholds for words related to that value cate- 
gory. It is proposed that the strength of this 
effect will differ for inner- and other-directed 
persons. The direction of the difference can be 
predicted on the basis of certain additional fac- 
tors which have been found to determine recog- 
nition thresholds in ambiguous stimulus situa- 
tions. Among these factors are the frequency 
with which the stimulus objects have been en- 
countered in the past, and the search require- 
ments imposed on the observer by his current 
needs and enterprises, which in turn will affect 
the number of alternative possibilities for which 
the observer is set (Allport, 1955; Bruner, 1957; 
Postman, 1951). 

According to Riesman, inner-directed persons
strive consistently for objects related to their 
preferred value categories and tend to retain 
throughout their lives about the same ranking of 
these categories. This tendency is related to 
their individualism and to their disposition to 
choose friends whose values are the same as their
own. For example, the inner-directed person who
is interested in theology will tend to read books
about theological matters, listen to sermons and
lectures on religion, choose friends who share
his interests, engage in conversations about
theology, and will continue to retain this basic
interest. From this it follows that inner-directed
persons will frequently respond to stimulus ob- 
jects related to their high value categories, at 
least insofar as these objects are the achieved 
goals of their own activity. 

In contrast, the other-directed persons are 
more likely to choose their activities in accord- 
ance with the interests of others, and they may 
even modify their own interests as changes occur 
in the peer group. Compared with their inner- 
directed counterparts, other-directed persons will 
have less frequent experience with stimulus objects
related to their most preferred value categories.

Since recognition threshold is inversely related 
to frequency of past perception, it follows that, 
for objects related to preferred value categories, 
an inner-directed person should have lower rec- 
ognition thresholds than should his other-directed 
counterpart. 

There are at least two reasons why inner- 
directed persons will be primarily concerned 
with searching out objects related to their high 
value categories. These objects provide them 
with cues as to how they ought to conduct them- 
selves and attainment of the objects provides 
for greater need satisfaction. The recognition 
thresholds for such objects ought to be low in
proportion to their value as guides and goals.
For other-directed persons, on the other hand,
need satisfaction is related more to perceiving
the values of others than to achieving objects
belonging to any one particular value category.
Other-directed persons must be ready to recognize
objects belonging to a variety of value
categories. Consequently, compared with their
inner-directed counterparts, they should have
relatively high recognition thresholds for objects
related to their own high value categories. This
follows from the fact that the recognition
threshold for objects belonging to a particular
category is raised when the observer is set, not
just for that category, but for alternative categories
as well.

From these considerations, two hypotheses follow:

1. For inner-directed subjects, the higher the
preference for a value category, the less will be
the stimulus input (e.g., the shorter will be the
exposure time) required for the correct perception
of stimuli falling in that value category.

2. The correlation between degree of preference
for a value category and the amount of
stimulus input required for the correct perception
of stimuli falling in that value category will
be greater for inner-directed than for other-
directed subjects.


METHOD 


The testing of the hypotheses involved three main 
procedures: the selection of subjects to represent the 
extremes of inner- and other-directedness, the de- 
termination of their value preferences, and the meas- 
urement of recognition thresholds for value-related 
words. 


Test Instruments 


When the present study was undertaken in 1958, 
the Bell Questionnaire was the only validated test 
available for measuring inner- and other-directed- 
ness (Bell, 1955; Sofer, 1961).1 The abbreviated ver- 
sion, used in the present experiment, is a self-ad- 
ministered, paper-and-pencil test consisting of 26 
items most of which describe hypothetical situations 
in which a fictional person is faced with the choice 
of behaving in accordance with his own values or of 
yielding to peer-group pressure. The Bell Question- 
naire and the Allport-Vernon-Lindzey Scale of 
Values (Allport, Vernon, & Lindzey, 1951) were 
administered to the 69 students enrolled in a sum- 
mer-session class in general psychology. The testing 
was done during regularly scheduled class discussion 
periods by the assistant instructor assigned to the 
class. Nothing was told to the students which would 
have enabled them to associate the tests with the 
later experiment on word recognition. 


Subjects 


The 15 students who obtained the highest test 
scores (other-directedness) and the 15 who obtained 
the lowest test scores (inner-directedness) were se- 
lected as potential subjects. In response to a tele- 
phone call from the experimenter, 13 of the 15 most 
other-directed students, and 12 of the 15 most 
inner-directed students, volunteered to participate in 
the experiment. Later 1 of the other-directed subjects
had to be dropped because another commitment
made it impossible for him to remain until the
completion of his experimental session. Thus the 
experimental groups consisted of 12 inner-directed 
and 12 other-directed subjects. 

The inner-directed subjects were found to be 
roughly equivalent to the other-directed subjects 
with respect to intelligence, age, and school class. 
The chief difference between the two groups was in 
regard to sex: Although the number of males was 
equal to the number of females in the other-directed 
group, there were four more females than males in 
the inner-directed group. 


1 Peterson (1964) has recently challenged the unidimensionality
of scales of inner- and other-directedness.
Procedure


The procedure used in the word-recognition ex- 
periment was a modification of that described by 
Postman and Schneider (1951). Eighteen words 
equated for length and frequency of usage were 
typed in capital letters on separate 4 × 6 inch white
cards and presented by means of a modified Dodge 
tachistoscope. Because Postman and Schneider had 
found that personal values manifested themselves
most clearly in the perception of their infrequently
used words, only those words were employed in the 
present experiment. 

Inner-directed and other-directed subjects par- 
ticipated in the experiment in alternating order. 
When the subject entered the experimental room, 
his attention was directed to the tachistoscope, and 
he was given the following instructions: 


This is a device for presenting words for very 
brief exposures. I will start out with an exposure 
so brief that you can hardly see anything but a 
flash, and then I will increase the time of ex- 
posure until you can see the word clearly. We 
know that when words are presented at very 
brief exposure intervals, people are able to form 
some impression of the word, even though they 
may not see it correctly. Each time I expose the 
word, I want you to make the best guess you can 
as to what the word is. I will keep on presenting
the word until you get it right. Then I will go on 
to the next word. 


The value-related words were then presented 
tachistoscopically, one at a time, in random order, 
beginning with an exposure of .01 second and in- 
creasing in even steps of .01 second until recognition 
occurred. (The longest exposure time required was .08
second.) Three trials were given at each tachisto- 
scopic exposure speed. If the subject did not make a 
response after each stimulus presentation, the experi- 
menter asked, “What did that look like?” If this 
did not elicit a response, he asked “Can you make 
a guess?” In 93% of the presentations, some response 
was eventually made by the subject. 
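
The ascending series of exposures described above can be sketched in Python. This is only an illustrative sketch: the function name, the `recognizes` callback (standing in for the experimenter's judgment that a guess was correct), and the `max_exposure` safety cap are assumptions introduced here, not features of the original apparatus or procedure.

```python
def ascending_threshold(recognizes, start=0.01, step=0.01,
                        trials_per_step=3, max_exposure=0.08):
    """Return the shortest exposure (in seconds) at which the word is recognized.

    `recognizes(exposure, trial)` is a hypothetical callback that returns True
    when the subject's guess on that trial is correct. `max_exposure` is an
    assumed cap so the loop always terminates; the report only notes that no
    subject needed more than .08 second.
    """
    n_steps = round((max_exposure - start) / step) + 1
    for i in range(n_steps):
        exposure = round(start + i * step, 2)   # .01, .02, .03, ...
        for trial in range(trials_per_step):    # three trials at each speed
            if recognizes(exposure, trial):
                return exposure
    return None  # never recognized within the assumed cap


# Made-up subject who first recognizes the word at .03 second.
example_threshold = ascending_threshold(lambda exposure, trial: exposure >= 0.03)
```

The value returned for each word is the quantity that is later summed over the three words of a value category.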

The first word presented to each subject was a 
practice word, “good,” which was used in order to 
accustom the subject to the procedure and to make 
certain that the instructions had been understood. 

After all subjects had participated in the experi- 
ment, the purpose of the study was explained to 
the introductory psychology discussion sections in 
which the subjects were enrolled. Questioning of the 
students revealed that before the purpose of the 
study had been explained, no subject had discerned it, 
nor had anyone even been aware of the idea of 
value categories in connection with the words which 
were presented tachistoscopically. Most subjects had 
thought the experiment to be a test of reading abil- 
ity, and nothing more. 


RESULTS 


The hypotheses were tested in the following 
manner: For each subject a total recognition-time
score was obtained for each of the Allport-
Vernon-Lindzey value categories by summing
the durations of the tachistoscopic exposure
times at which the subject correctly recognized
the three words representing that value category.


A Kendall rank correlation coefficient (tau), 
corrected for ties, was computed for each subject 
relating recognition-time scores, ranked in order 
of increasing magnitude, to value scores, ranked 
in order of decreasing preference. For inner- 
directed subjects the coefficients ranged from 
−.26 to +.86 with a median of +.37. Coefficients
for other-directed subjects ranged from
−.60 to +.50 with a median of −.19.
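
The per-subject computation can be illustrated with a short Python sketch. It assumes that the tie correction corresponds to what is now usually called Kendall's tau-b; the value scores and exposure times are invented for one hypothetical subject, and the function and variable names are introduced here for illustration only.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall rank correlation with the tau-b correction for ties."""
    n = len(x)
    concordant = discordant = ties_x = ties_y = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0:
                ties_x += 1
            if dy == 0:
                ties_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    n_pairs = n * (n - 1) // 2
    return (concordant - discordant) / sqrt((n_pairs - ties_x) * (n_pairs - ties_y))


# Hypothetical subject: value score and exposure times (hundredths of a second)
# for the three words in each of the six value categories.
value_scores = {"theoretical": 48, "religious": 47, "aesthetic": 42,
                "social": 38, "economic": 35, "political": 30}
word_times = {"theoretical": [2, 3, 2], "religious": [3, 3, 2],
              "aesthetic": [4, 4, 3], "social": [3, 3, 4],
              "economic": [5, 4, 4], "political": [4, 5, 5]}

categories = list(value_scores)
# Total recognition-time score: sum of the three exposure times per category.
time_scores = [sum(word_times[c]) for c in categories]
# Negating the value score orders categories by decreasing preference, so a
# positive tau means "stronger preference goes with shorter recognition time".
preference_order = [-value_scores[c] for c in categories]

tau = kendall_tau_b(preference_order, time_scores)
```

With these invented figures, 14 of the 15 category pairs are concordant and 1 is discordant, so tau is 13/15, or about +.87.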

It was assumed that if Hypothesis 1 were 
correct, the correlation coefficients for inner- 
directed subjects would be predominantly posi- 
tive. The obtained coefficients were tested by 
means of a binomial test (p = .003, one-tailed 
test). The results were consistent with the hy- 
pothesis and it was concluded that for inner- 
directed subjects, the higher the preference for a 
value category, the shorter the exposure time 
required to recognize words related to that cate- 
gory. 

It was assumed that if Hypothesis 2 were
correct, the correlations between ranked total
recognition-times and ranked value preferences as 
defined above would be higher for inner-directed 
than for other-directed subjects. All correlation 
coefficients were ranked in order of increasing 
size, taking the algebraic sign into account. A
Mann-Whitney U test indicated that the coeffi- 
cients of other-directed subjects were signifi- 
cantly lower in rank than the coefficients of 
inner-directed subjects (p < .01, one-tailed test). 
This was the case even though coefficients of 
variation, computed for the recognition-time 
scores, tended to be larger for other-directed 
than for inner-directed subjects. Therefore, it 
was concluded that the tendency for an increase 
in value preference to be correlated with a de- 
crease in word-recognition time is less for other- 
directed subjects than for inner-directed sub- 
jects. 
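
A minimal Python sketch of the Mann-Whitney comparison follows. The coefficients are invented and the groups are reduced to five subjects each so that the exact one-tailed probability can be obtained by enumerating every split of the pooled values. The function names are hypothetical, and the original analysis presumably relied on published tables rather than enumeration.

```python
from itertools import combinations


def mann_whitney_u(a, b):
    """U for sample a: number of (a, b) pairs in which a exceeds b (ties count 1/2)."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def exact_one_tailed_p(a, b):
    """Proportion of all splits of the pooled data with U at least as large as observed."""
    pooled = list(a) + list(b)
    observed = mann_whitney_u(a, b)
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if mann_whitney_u(group_a, group_b) >= observed:
            hits += 1
    return hits / total


# Invented tau coefficients for illustration only (not the study's data).
inner_directed = [0.40, 0.55, -0.10, 0.70, 0.25]
other_directed = [-0.20, -0.45, 0.30, 0.05, -0.35]

u_inner = mann_whitney_u(inner_directed, other_directed)
u_other = mann_whitney_u(other_directed, inner_directed)
p_one_tailed = exact_one_tailed_p(inner_directed, other_directed)
```

The two U values always sum to the product of the group sizes. Full enumeration is practical only for small groups; with 12 subjects per group there are about 2.7 million splits, so tables or a normal approximation are the usual route.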

There are several secondary findings worthy of 
note. Postman, Bruner, and McGinnies (1948), 
Blake and Vanderplas (1950), and Postman and 
Schneider (1951) found that the stronger the 
preference for a given value, the shorter the 
tachistoscopic exposure time required for the 
correct recognition of words related to that value. 
The results of the present experiment indicate 
that this finding does not hold for a population 
composed of both inner-directed and other-di- 
rected subjects. A binomial test revealed that 
when the data for other-directed and inner-
directed subjects are combined, there is not a
significantly larger number of positive than negative
correlations between increasing strength
of value preference and decreasing recognition
time (p = .271, one-tailed test). The median
correlation was found to be +.035.

An analysis of subjects' prerecognition responses,
which will not be reported in detail,
provided no support for any of the findings of
Postman et al. Nor were any significant differences
between inner-directed and other-directed
subjects found with regard to these responses.


DISCUSSION


Two potentially serious objections might be
raised concerning the present experiment. The
first has to do with possibilities for confounding
which arise from the fact that recognition times
might differ for words belonging to different value
categories. The second is concerned with sex differences.

In order to determine whether recognition-time
scores were related to value category a repeated-
measurements analysis of variance was
carried out. The results are reported in Table 1.
It was found that words belonging to different
value categories require significantly different
amounts of time to be perceived correctly (p <
.01). This may indicate that words belonging to
one value category are more difficult to perceive
than words belonging to another value category,
or that some other uncontrolled factor is correlated
with value category and has a systematic
effect on word-recognition time. In either event
it gives rise to the possibility of confounding.
For example, if the least difficult words tend to
be those preferred by inner-directed subjects,
this would be sufficient to account for the fact
that inner-directed subjects have a significantly
larger number of positive than negative correlations
between an increase in value preference
and a decrease in word-recognition time. It
would also be sufficient to account for the fact
that the correlations of other-directed subjects
are significantly smaller, considering algebraic
sign, than the correlations of inner-directed subjects.


TABLE 1

REPEATED-MEASUREMENTS ANALYSIS OF VARIANCE OF
WORD-RECOGNITION TIMES FOR
VALUE CATEGORIES


Source                 df      MS        F
Inner-other (A)         1    36.000    3.300
Between subjects       22    10.908
Value category (B)      5    15.478    7.427*
B × A                   5     1.483    <1
Residual              110     2.084

* p < .01.
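
The F ratios in Table 1 can be checked from the mean squares with a few lines of Python. The sketch reads the tabled mean squares as 36.000, 10.908, 15.478, 1.483, and 2.084, and assumes the usual mixed-design error terms: the between-subjects mean square for the group effect, and the residual mean square for the within-subject effects. The variable names are introduced here for illustration.

```python
# Mean squares as read from Table 1.
ms_group = 36.000             # Inner-other (A), df = 1
ms_between_subjects = 10.908  # error term for A, df = 22
ms_value_category = 15.478    # Value category (B), df = 5
ms_interaction = 1.483        # B x A, df = 5
ms_residual = 2.084           # error term for B and B x A, df = 110

# Each F is the effect mean square divided by its error mean square.
f_group = ms_group / ms_between_subjects
f_value_category = ms_value_category / ms_residual
f_interaction = ms_interaction / ms_residual

# Residual df: (subjects within groups) x (value categories - 1).
df_residual = 22 * (6 - 1)
```

Under these assumptions the ratios come to 3.300 and 7.427, with the interaction ratio below 1 and 110 residual degrees of freedom, in agreement with the table.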
An adequate test of the hypotheses would require
that the intrinsic relationship between
value rank and recognition-time scores be de- 
termined by removing the possible influence of 
variations in the difficulty of the words defining 
the various value categories. This was accomp- 
lished by computing a Kendall partial rank- 
correlation coefficient for each subject. These 
coefficients were then analyzed to provide a more 
refined test of the original hypotheses. 
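
For readers unfamiliar with the statistic, Kendall's partial rank correlation can be sketched in Python as below. The formula is the standard one for partialling a third ranking out of two others. Treating the third ranking as the overall difficulty of each value category's words is an assumption made here, since the report does not spell out how difficulty was ranked, and the three input coefficients are invented.

```python
from math import sqrt


def kendall_partial_tau(tau_xy, tau_xz, tau_yz):
    """Kendall's partial tau between x and y with z held constant."""
    return (tau_xy - tau_xz * tau_yz) / sqrt((1 - tau_xz ** 2) * (1 - tau_yz ** 2))


# Invented coefficients for one hypothetical subject:
#   x = value-preference rank, y = recognition-time rank,
#   z = assumed difficulty rank of each value category's words.
partial = kendall_partial_tau(tau_xy=0.60, tau_xz=0.50, tau_yz=0.40)
```

With these invented inputs the partial coefficient is about +.50, somewhat lower than the unadjusted +.60 because part of the association is shared with the third ranking.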

Among inner-directed subjects the new values
ranged from −.50 to +.84 with a median of
+.38. Ten of the obtained coefficients were
positive. Coefficients for other-directed subjects
ranged from −.64 to +.49 with a median of
−.04. Four of the obtained partial rank-correlation
coefficients were positive.

The prediction that the correlation coefficients 
of inner-directed subjects would be predomi- 
nantly positive was tested once again by means 
of the binomial test. The results were again 
found to be consistent with the prediction (p 
= .019, one-tailed test). 
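
The reported probability can be reproduced with a one-tailed binomial (sign) test in Python. The sketch assumes the null hypothesis that a positive and a negative coefficient are equally likely (p = .5) and uses the count given above, 10 positive coefficients among the 12 inner-directed subjects. The function name is introduced here for illustration.

```python
from math import comb


def binomial_upper_tail(successes, n, p=0.5):
    """Probability of observing `successes` or more out of n under chance level p."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(successes, n + 1))


# Ten of the 12 partial rank-correlation coefficients were positive.
p_partial = binomial_upper_tail(10, 12)   # 79 / 4096

# For comparison: 11 positive out of 12 would give 13 / 4096.
p_eleven = binomial_upper_tail(11, 12)
```

The first value rounds to .019, matching the figure in the text. The second rounds to .003, which equals the probability reported for the original coefficients, although the count behind that earlier figure is not stated in the report.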

The second hypothesis involving the prediction 
that for inner-directed subjects there would be 
a higher positive correlation between recognition- 
time scores and value ranks than for other- 
directed subjects was retested by means of the 
Mann-Whitney U test. The obtained value was 
highly significant (p< .001, one-tailed test). 
Hence it can be concluded that the tests of Hy- 
potheses 1 and 2 were valid in the sense that 
the results cannot be explained on the basis of 
differences in the difficulty of words belonging 
to different value categories. 

The failure to obtain groups of inner- and 
other-directed subjects that were matched with
respect to sex gave rise to the possibility that 
the results of the study were a reflection of 
sex differences rather than social character type. 
The effect of sex difference was examined by 
means of a Mann-Whitney U test. It was found 
that the correlation coefficients of inner-directed 
females were not significantly different in rank 
from those of inner-directed males (p> .50, 
two-tailed test). For inner-directed and other- 
directed subjects taken together, a Mann- 
Whitney U test showed that the correlation co- 
efficients of females were not significantly dif- 
ferent in rank from those of males (p= .808, 
two-tailed test). 

To summarize, it has been shown that: for 
inner-directed subjects, the higher the preference 
for a value category, the shorter the exposure 
time required to recognize words related to that 
category; and this effect is greater for inner- 
directed subjects than for other-directed subjects.
These results cannot be explained on the
basis of sex differences or on the basis of differences
in the difficulty of words belonging to
different value categories. 

In deriving the experimental hypotheses, certain
assumptions were made about the perceptual
sets of inner- and other-directed persons. Riesman
has characterized the inner-directed person
as one whose behavior is guided throughout life
primarily by unchanging loyalty to a given category
of values such as power, wealth, knowledge,
or holiness. In the present study it was argued
that such a person comes to have strong sets 
to perceive objects relating to his preferred 
values; that is, he comes to see the world pri- 
marily in terms of his dominant value commit- 
ment. Riesman also described the other-directed 
person as being guided in his behavior primarily 
by the expectations of others. It seemed reason- 
able to believe that such a person is not set 
to perceive the stimulus situation exclusively in 
terms of his preferred values but has a larger 
number of alternative perceptual sets which 
render him more sensitive to the differing values, 
and hence to the differing expectations, of others. 

The recognition-time findings support the as- 
sumptions about the perceptual sets of inner- 
and other-directed persons. If these assumptions 
are correct, they are of interest because they 
may indicate the operation of perceptual mecha- 
nisms which maintain the modes of adaptation 
characteristic of inner- and other-directed per- 
sons. For example: Because the inner-directed 
person lacks relatively strong alternative per- 
ceptual sets, he might be expected to find it 
difficult to perceive accurately the feelings and 
expectations of others. This inability will force 
him to rely on his own value commitment for 
guidance, even though he might wish to be more 
responsive to the expectations of others. His 
inferior perception of others will make it dif- 
ficult for him to get along with them, and this 
will strengthen his tendency to invest his energy 
in the mastery of things or ideas rather than in 
interpersonal relations. The things or ideas to 
which he devotes his attention will be those 
related to his dominant value commitment. Be- 
cause a person comes to expect to perceive what 
he does in fact perceive, there will be a strength- 
ening of the inner-directed person's set to per- 
ceive stimuli related to his high value. This 
strengthening will be at the expense of sets to 
perceive objects related to alternative values. 
Thus there is a self-perpetuating system. 

In the case of the other-directed person, a
superior ability to perceive the feelings and
expectations of others will make interpersonal
relations pleasant and will thus reinforce this
tendency to look to other people for satisfactions.2
Additional characteristics of the other-directed
person (his adoption of the tastes of
others, his tendency to “conform in the very
quality of his feelings”) will follow in part from
his more extensive contacts with others as well
as from his superior sensitivity to their feelings.
Many characteristics which Riesman ascribes
to inner- and other-directed persons may be
explained in a similar manner in terms of the
creation of certain kinds of perceptual sets and
in terms of the tendency of these sets to help
perpetuate the behavior which created them.


2 These speculations are consistent with Kassarjian's
(1962) report that students majoring in the
natural sciences and the humanities tend to be
inner-directed whereas those majoring in education, 
business, and medicine tend to be other-directed, and 
our own unpublished findings that students who 
graduate in natural sciences and engineering tend to 
be inner-directed and those graduating in education, 
medicine, and business tend to be other-directed. 

Sex of E, sex of child, withdrawal of attention, and their interactions were
tested for their effects on resistance to temptation in 112 4-yr.-old children. 
The measure of resistance was obtained by testing whether or not a child, 
when left by himself and tempted to get a high score by cheating, would 
play a game according to rules established by an adult. Children conformed 
to the rules more with an adult of the opposite sex. Withdrawal of attention 
increased cheating for boys but had no effect on girls. Several possible inter- 
pretations of these results are considered. The desire to please a cross-sex adult, 
the arousal of achievement motivation by a same-sex adult which may be 
increased in boys by withdrawal of attention, and resentment at withdrawn 
attention on the part of boys, could be involved either singly or in combination. 


This experiment is part of a larger study 
designed to investigate the factors influencing 
resistance to temptation. A paper has already 
reported the relationships between child-rear- 
ing practices and temptation behavior and in- 
dicated the complex nature of this area of 
study (Burton, Maccoby, & Allinsmith, 1961).

The theoretical bases for this experiment
came mainly from modifications of Freudian
identification theory as delineated by Allinsmith
(1954), Maccoby (1959), Sears, Maccoby,
and Levin (1957), and Whiting (1954,
1959), and the work on nurturance-withdrawal
of Hartup (1958). Sears et al. proposed
a theory of identification which predicts
that girls will be more strongly identified with
their mothers than boys will be with their
fathers. The rationale for this theory is that
the mother is the main agent for important
resources, especially for nurturance and discipline,
during infancy and early childhood
regardless of the sex of the child, and is therefore
the first object of identification for both


1 We express our appreciation to the staff of the
Laboratory of Human Development of Harvard University,
especially to John W. M. Whiting, for many
constructive comments on this study.


boys and girls. She continues in this role for
her growing daughter; but the father becomes 
more and more the boy’s model for identifica- 
tion as he takes a more active disciplinary role 
in his son’s life, and as he possesses more of 
the skills his maturing son wants to have. 
From this theory the prediction was that girls 
would conform to standards established by an 
adult experimenter more than would boys and 
that this would be especially so when the ex- 
perimenter was a woman. Furthermore, a fe- 
male experimenter should produce more con- 
formity than a male experimenter in both 
boys and girls who are only 4 years old. 
Using the work of Hartup (1958) as a 
basis, we also hypothesized that withdrawal of 
attention by a formerly nurturant experi- 
menter would arouse dependency anxiety 
which would mediate the motive to reestablish 
a nurturant relationship with the experi- 
menter. We would expect, from this reason- 
ing, that interrupting the attention the ex- 
perimenter paid to the subject would pro- 
duce greater identification with the experi- 
menter’s rules and thus greater conformity to 
such rules than would continuous attention. 
These predictions assume that: acceptance
of adult standards will be mediated by the
identification process; at this early age there
has not been enough time for the boy to shift
his main object of identification to his father;
there will be generalization from the child's
mother and father to the female and male
experimenters, respectively.

The standards established in the experi- 
mental situation were rules of a very simple 
game. The adult experimenter taught the 
child subject these rules and the child was 
then tempted to deviate from these rules in 
order to get a “good score.” This experiment 
was designed to answer the following questions 
concerning the sex of the child subject, the 
sex of the experimenter, and withdrawal of 
attention: 


1. Is there an overall difference between 
4-year-old boys and girls in conforming to the 
rules of a game? 

2. Does the sex of the experimenter affect 
the behavior of the child in a resistance to 
temptation setting? 

3. Does the sudden withdrawal of attention 
just prior to the temptation test affect the 
child’s behavioral tendencies to resist tempta- 
tion or to deviate from the rules? 

4. Are there any interaction effects from 
sex of the subject, sex of the experimenter, 
and withdrawal of attention on resistance to 
temptation? 


METHOD 
Subjects 


The 112 children in this study were all 4 years old 
and enrolled in private nursery schools. They came 
from well-established, middle-class homes, with well- 
educated parents. The fathers were professional men, 
executives in business, or graduate students. None of 
the children showed any noticeably “abnormal” 
characteristic. 


Procedure 


The experimenter brought the subject, individually, 
from the classroom to the room used for the experi- 
ment, telling him, “We have a game for you children 
to play, and it is your turn.” The experimenter talked 
with the subject while walking to the experimental 
room and tried to be warm and friendly during this 
time. In a few cases, the teacher had to accompany 
the child to the testing room, but she remained only 
a minute or two until the subject became fascinated
with operating the game.
The game consisted of a 1 × 4 foot board with
five lights which came on, accompanied by a chime,
whenever a string behind the board was hit. One
light came on at a time and remained lit until the
experimenter reset the game. The rules of the game
were to stand on a foot marker, placed about 5
feet from the game, and to try to hit the string with
bean bags which were to be thrown only once over
the front board. The game was placed against a
backstop so that all bags which went over the 1 foot
high board would land somewhere near the string.
Since the string was behind this front board, the
subject could not see whether or not his bag actually
hit the string when he was standing on the
marker. In fact, a hidden experimenter (E2) completely
controlled these lights and chimes. He was
behind a one-way mirror in a portable observation
booth which could be installed in whatever room
the school let us use for testing.

A standardized script was followed in showing the
subject how the game “worked” and in teaching him
the rules of standing on the marker when throwing
and of throwing each bag only once over the board.
All subjects received the same schedule of three
“hits” out of the possible five, for two practice
games. After the subject clearly demonstrated he
understood the rules, E1 took the subject to a nearby
table and said, “We'll play with this again later.
Now I have another game to show you.” This new
activity, with which the child was to be engaged for
a 3-minute period, consisted of little plastic pieces
which could be fitted together by their ball-and-socket
connections to make Walt Disney animals.

After 1 minute of play during which E1 was
very nurturant and attentive to the subject, E2 signaled
E1 in the event that this subject was to receive
interrupted instead of continuous attention. The
treatment for the subject was kept from E1 to avoid
any influence such knowledge might have had on his
behavior during the first part of the procedure. E2
tapped his pencil once against the wall to signal interrupted
attention. E1 would then go to another
table, without any explanation to the subject, and
start to fill out a rating form on the subject's behavior.
In response to the subject's questions or requests,
E1 said, “You go ahead and play. I'm busy,”
or “I have some work to do now.” The attempt was
to make as much contrast as possible between
“warm” and “cold” relationships during this interrupted
play treatment.2 At the end of 2 minutes, E1
said, “Well, that's done,” and returned to the subject.
For the continuous attention treatment, no signal
was given to E1 so that he continued to play very
nurturantly with the subject for the 3 minutes.

At the end of this period, for both interrupted and
continuous attention treatments, E1 put the construction
toys in a box and out of sight and reach of the
subject while saying, “Now we'll play the bean bag
game again.” A tray of toys, pretested for attractiveness,
and including items of appeal to both sexes


2 We thank Judy F. Rosenblith who gave us suggestions
about the design of the study in regard to
withdrawal of attention.
TABLE 1


MEAN NUMBER OF BAGS CORRECTLY THROWN
BEFORE DEVIATION (TOTAL SAMPLE)


                              Boys                        Girls
Treatment               Male E      Female E      Male E      Female E
Continuous attention    5.3 (n=11)  6.1 (n=24)    5.8 (n=10)  4.4 (n=26)
Interrupted attention   3.7 (n=11)  4.9 (n=10)    5.6 (n=10)  4.5 (n=11)


as well as sex-neutral toys, was uncovered and the
subject was told that he could win the toy of his
choice if he “got enough lights on.” The subject was
then asked which one he would choose if he should
get enough lights to win the prize. This was done
to insure that all subjects focused on a toy they
would really like. This was our method of maximizing
temptation in order to control for differential
arousal to deviate from the rules. E1 tested the
subject on whether he knew the rules for the game,
and, if necessary, reviewed them with him. Just as
the subject was about to play the game, E1 looked
at his watch and said, “I have to go out to make
a telephone call, but you go ahead and play the
game according to the rules while I'm gone.” To
eliminate fear of being caught, E1 took the subject
to the door and showed him how he was to hook
the door so that no one could come in and “bother”
him while he was playing the game. E1 said he
would knock when he returned. All children understood
the instructions and locked E1 out as we intended.

Only one light, after the second throw, was given
to the subject during the 3-minute test period if he
followed the rules. Additional lights were given for
each act of breaking the rules: stepping forward,
moving the foot marker, retrieving bags that had
already been thrown and rethrowing them, and
hitting the string with the hand. During this test
period, E2 recorded the subject's behavior and controlled
the lights.

After the 3 minutes, E1 returned, knocked on the
door for admittance, and said to the subject, “Let's
play the game again and this time will be for the
prize.” E1 ignored the lights the subject obtained during
the test period. If the subject indicated he
wanted E1 to consider that score for the prize or for
some sign of approval, E1 said, “You certainly know
how to play the game now, and this time will be
for the prize.” This last game was played to have a
check on whether the subject really understood and
would follow the rules with E1 present, to eliminate
any guilt or feelings of failure which might have
resulted from the subject's behavior during the test
period, and to avoid reinforcing any cheating.3


3 For discussions of differences in ability, of the
different motives which might be aroused and which 
Resistance to Temptation Measure


The resistance measure was a 7-point scale based 
on a count of the number of bags the subject threw 
correctly before deviating from the rules. If the 
subject deviated immediately, he received a score of 
1. His score was 2 if he threw one bag correctly and 
then deviated. If he threw all five bags correctly and 
then cheated, he had a score of 6. If he never deviated 
during the test period, his score was 7. The scoring 
reliability of this measure was almost perfect.4
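
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the 7-point scale as the text states it; the function name resistance_score and its arguments are hypothetical and are not taken from the original study.

```python
def resistance_score(correct_before_deviation, deviated, total_bags=5):
    """Return the 1-7 resistance score described in the text.

    correct_before_deviation: bags thrown correctly before the first deviation.
    deviated: True if the child broke the rules at any point in the test period.
    total_bags: number of bean bags available (five in the study).
    """
    if not deviated:
        # Never deviating during the test period earns the top score (7).
        return total_bags + 2
    if not 0 <= correct_before_deviation <= total_bags:
        raise ValueError("correct_before_deviation must be between 0 and total_bags")
    # Immediate deviation scores 1; each correct bag before deviating adds 1.
    return correct_before_deviation + 1


if __name__ == "__main__":
    for bags, dev in [(0, True), (1, True), (5, True), (5, False)]:
        print(bags, dev, resistance_score(bags, dev))
```

A higher score therefore means the child held out longer before cheating, with 7 reserved for children who never cheated.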


RESULTS 


Table 1 gives the means and n's for each
group in the experimental design. Bartlett's
(1937) test indicated there was nonhomogeneity
of variance among the cells, which, with
unequal n's, precluded a straight analysis of
variance with appropriate weightings for each
cell. Table 2 shows the means of a reduced
sample with each group having 10 subjects.

Table 3 presents the results of an analysis 
of variance of the reduced sample. 

It can be seen that none of the main effects
were significant by themselves. The only sig- 
nificant F ratio is the interaction between sex 
of subject and sex of experimenter. An inspec- 
tion of the means shows that this result is 
due to greater resistance in the cross-sex 
groups. 
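
As a check on the analysis of variance, the effect sums of squares can be recomputed from the reduced-sample cell means alone, because the design is a balanced 2 x 2 x 2 with 10 subjects per cell. The sketch below is an illustration, not the authors' computation: the cell means are the Table 2 values as read from the scan (some digits repaired, which is an assumption), the error mean square is the value printed in Table 3, and all names are hypothetical.

```python
N_PER_CELL = 10      # subjects per cell in the reduced sample
MS_ERROR = 4.5944    # error mean square as printed in Table 3 (df = 72)

# Cell means keyed by (A, B, C) with +1/-1 coding:
#   A: sex of subject       (+1 boys, -1 girls)
#   B: sex of experimenter  (+1 male, -1 female)
#   C: treatment            (+1 continuous, -1 interrupted)
CELL_MEANS = {
    (1, 1, 1): 5.6, (1, -1, 1): 5.9, (-1, 1, 1): 5.8, (-1, -1, 1): 4.3,
    (1, 1, -1): 3.7, (1, -1, -1): 4.9, (-1, 1, -1): 5.6, (-1, -1, -1): 4.4,
}


def effect_ss(factors):
    """Sum of squares (1 df) for a main effect or interaction.

    factors: tuple of factor indices (0 = A, 1 = B, 2 = C) whose codes are
    multiplied together to form the contrast over the eight cell means.
    """
    contrast = 0.0
    for codes, mean in CELL_MEANS.items():
        sign = 1
        for f in factors:
            sign *= codes[f]
        contrast += sign * mean
    # For a balanced design: SS = n * L^2 / (number of cells).
    return N_PER_CELL * contrast ** 2 / len(CELL_MEANS)


EFFECTS = {
    "A": (0,), "B": (1,), "C": (2,),
    "AxB": (0, 1), "AxC": (0, 2), "BxC": (1, 2), "AxBxC": (0, 1, 2),
}

if __name__ == "__main__":
    for name, factors in EFFECTS.items():
        ss = effect_ss(factors)
        print(f"{name:6s} SS = {ss:6.2f}  F = {ss / MS_ERROR:6.3f}")
```

With these means the sex of subject by sex of experimenter term is the only one whose F ratio exceeds 4, which agrees with the single significant interaction reported in the text.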

The means of the different groups in Tables 
1 and 2, however, indicated that boys and 


might produce individual differences in temptation, 
and of our attempt to control for such differences by 
the described procedures, see Burton et al. (1961). 

4 Later research has shown that this test has very 
high test-retest reliability: 19 of 20 children, tested 
1 week apart, either conformed to the rules or devi- 
ated both times. Among those who deviated, there is 
a tendency (not statistically significant) upon second 
testing to deviate faster in actual time and in fewer 
bags correctly thrown before first deviation.


TABLE 2


MEAN NUMBER OF BAGS CORRECTLY THROWN
BEFORE DEVIATION (REDUCED SAMPLE)


                              Boys                  Girls
Treatment               Male E    Female E    Male E    Female E
Continuous attention     5.6       5.9         5.8       4.3
Interrupted attention    3.7       4.9         5.6       4.4


Note. n = 10 in each group.
TABLE 3


ANALYSIS OF VARIANCE OF NUMBER OF BAGS
CORRECTLY THROWN BEFORE DEVIATION
(REDUCED SAMPLE)


Source                      df       SS        MS        F

Sex of subject (A)           1       .00       .00      .000
Sex of experimenter (B)      1      1.80      1.80      .392
Treatment (C)                1     11.25     11.25     2.449
A × B                        1     22.05     22.05     4.799*
A × C                        1      9.80      9.80     2.133
B × C                        1      1.80      1.80      .392
A × B × C                    1       .45       .45      .098
Error                       72    330.80    4.5944

Total                       79    377.95

*p < .05.


girls should be analyzed separately, especially 
in regard to any treatment effects. As there 
is no indication from these means of any 
interaction between sex of experimenter and 
type of treatment, we have returned to the 
full sample to make two-way comparisons of 
the groups controlling on sex of subjects. 
Table 4 summarizes these contrasts. These 
contrasts indicate there is a cross-sex effect 
for both boys and girls—such that there is 
more cheating with a same-sex experimenter 
and more conformity to the rules with an 
opposite-sexed adult. Furthermore, the treat- 
ment effect remains significant for boys and, 
though in the opposite direction, is clearly not 
significant for girls. Continuous attention for 
boys produced a greater abiding by the rules 
and interrupted attention produced deviation 
from the rules. 

In order to assess the effect of individual 
experimenters, comparisons were made for the 


TABLE 4


DIFFERENCES BETWEEN MEANS OF GROUPS
IN TOTAL SAMPLE


Contrast (Means)                              t        df

Boys
  Male experimenter (4.524) versus
    female experimenter (5.735)             2.289*     53
  Continuous (5.829) versus
    interrupted (4.300)                     2.745**    53

Girls
  Male experimenter (5.700) versus
    female experimenter (4.459)             2.07*      55
  Continuous (4.806) versus
    interrupted (5.048)                      .405      55


R. V. BURTON, W. ALLINSMITH, AND E. E. MACCOBY


three male experimenters and the seven fe- 
male experimenters. There were no significant 
differences among any of these within-sex 
experimenter comparisons. 


DISCUSSION


Referring to our original predictions, it is 
clear that 4-year-old girls did not abide by 
rules more than boys, and that the sex of 
experimenter was significant only in interaction
with the sex of subject. Further, withdrawal
of attention had an effect only on boys
and this effect was opposite to what would 
be expected were dependency arousal to in- 
crease identification with—and consequently, 
conformity to—the experimenter’s standards. 


Sex Interaction 


A reexamination of the assumptions underly- 
ing the derivations of our hypotheses indicates 
that conformity with the rules would be re- 
lated to the degree to which the subject 
identified with the experimenter. It is clear, 
post hoc, that this assumption may not have 
been correct, and that the data are more in 
line with predictions reasoned from a theo- 
retical scheme of the stages in the identifica- 
tion process as originally depicted by Freud 
(1933). According to this picture, the child 
of 4 would not yet have identified with the 
parent of the same sex but would be experi- 
encing increasing libidinal attachment toward 
the parent of the opposite sex. Identification 
with the same-sex parent should occur with 
the Oedipal resolution around 6 years of age. 
From such a theoretical position, one could
derive the hypothesis that resistance to temp- 
tation in 4-year-olds would be greater with 
opposite-sex adults in order to please them. 
Our results are certainly more in line with 
such considerations. However, if we are deal- 
ing with pre-Oedipal-resolution children who 
have a greater cathexis for the opposite-sex 
parent, the predictions regarding conformity 
to the rules of our game would not be based 
on “internalization” of parental strictures. 
The issue of pleasing the no-longer-present 
experimenter in our test situation becomes 
ambiguous in that obtaining a good score to
please the experimenter is as likely as con- 
forming to his rules. Thus, to make precise 
differential predictions from either the pre-


RESISTANCE TO TEMPTATION


Oedipal-resolution or postresolution psycho- 
analytic conceptions is difficult. Data on 7- 
year-olds are now necessary for a more com- 
plete test of this theory since after the 
Oedipal resolution the significant interaction 
should be for greater compliance with the 
rules established by a same-sex experimenter.

Other experiments investigating sex of 
experimenter and of child as independent 
variables demonstrated their effect on performance
and are relevant for our post hoc
considerations. Stevenson (1961) found that
a female experimenter was more effective than 
a male experimenter as a dispenser of social 
reinforcements to increase performance in 3-4 
year olds. But for ages older than 4, the 
results often indicate that reinforcements dis- 
pensed by an opposite-sex experimenter are 
more effective than those by an experimenter 
of the same sex as the subject (Gewirtz & 
Baer, 1958a, 1958b; Gewirtz, Baer, & Roth, 
1958; Stevenson, 1961). The results of these 
studies and of our experiment all conform to 
the hypothesis that young children are moti- 
vated to please an opposite sex adult more 
than an adult of the same sex. 

This interpretation of our results is based 
on the assumption that, in order to please an 
adult of the opposite sex, the child will con- 
form to the rules, rather than cheat to obtain 
a high score in order to please an adult of 
the same sex. In fact, however, each of these 
motives may be contributing to our findings 
if conforming behaviors are more associated 
with gaining love from the opposite-sex parent 
and achievement behaviors are more often 
instigated by the relationship with the same-sex
parent. It is also possible that the wish
to please, though operating in our results, is
different for boys and girls. In the case of
boys, it is the desire to please a father figure
by achieving: hence, by contrast, conformity
would be associated with an opposite-sex
experimenter. For girls, it is the wish to please
a father figure by conforming compared with
less conformity with a same-sex adult. Though
less parsimonious than a single-factor model
in positing a different basis in boys from that
in girls, this interpretation (that both conformity
to restrictions of opposite-sex adults
and achievement arousal by same-sex adults
produce an association of conformity with
having an opposite-sex experimenter) does
seem reasonable for our results and would be 
consonant with the results of other research 
cited above and below. 


Withdrawal of Attention 


Manipulating the relationship of the child 
to the experimenter has also produced differ- 
ential performance in other studies. With- 
drawal of attention (Hartup, 1958; Rosen- 
blith, 1959, 1961) and complete isolation from 
social contacts (Gewirtz & Baer, 1958a, 
1958b; Gewirtz et al., 1958; Stevenson & 
Odom, 1962; Walters & Ray, 1960) have 
tended to increase performance, although 
these results are inconsistent in regard to 
whether withdrawal is more effective when 
the experimenter is the same or opposite sex 
as the subject. These results were interpreted 
as supporting dependency arousal (Hartup, 
1958; Rosenblith, 1959, 1961), social deprivation
and drive (Gewirtz & Baer, 1958a, 1958b;
Gewirtz et al., 1958; Stevenson & Odom, 
1962), and anxiety arousal (Walters & Ray, 
1960). 

If performance in these experiments is in- 
creased in order to please the experimenter, 
then the results just reviewed might suggest 
that withdrawal or isolation would increase 
this motive. Our data conform to this inter- 
pretation if one assumes that for boys a 
motive to please the experimenter by getting 
a high score is operating (Table 4). For the 
effect found is that in boys withdrawal in- 
creases cheating. If conformity to the rules of 
the game is increased in boys by wanting to 
please a female (opposite-sex) adult, then 
withdrawal seems to decrease this motive. 
Perhaps there is some feeling of being rejected
which reduced any need to conform
to the rules to please the experimenter.
Were this the case, the motive to resist 
temptation would be decreased in the subject 
by withdrawal of attention, leaving as a 
greater influence on his behavior the motive 
to cheat in order to win a prize for himself. 
In the case of girls, there is a very slight 
difference in favor of withdrawal to increase 
conformity, but this is mainly due to the 
larger number of girls in the continuous at- 
tention cell to have had a female experimenter. 
Arousal of Achievement


In our post hoc discussion we have re- 
peatedly felt the need to consider the arousal 
of achievement in accounting for our findings. 
The temptation test used appears to be 
strongly loaded with components relevant for 
achievement motivation. This consideration 
was given support in the somewhat parallel 
study by Grinder (1960) which produced its 
clearest results when he analyzed the resist- 
ance to temptation data in such a way as to 
control for achievement motivation. If the 
degree of achievement motive aroused in the 
subject were increased by withdrawal of at- 
tention, and the amount of cheating were cor- 
related with this increase in motivation to 
obtain a high score, then the increase in 
cheating in boys under the withdrawal of at- 
tention is understandable. The lack of this 
effect on girls would be consistent with other 
failures to obtain increased achievement mo- 
tivation in girls (McClelland, Clark, Roby, 
& Atkinson, 1958). 

With such different interpretations pos- 
sible, it seems that no single explanation 
presently available can account for the re- 
sults of studies such as this. Clearly, addi- 
tional studies investigating the motives in- 
volved in temptation tests are required to 
assess the extent to which these and pos- 
sibly other motivational components deter- 
mine whether a person will cheat or not. 

CHILDHOOD PREJUDICE AS A FUNCTION OF PARENTAL
ETHNOCENTRISM, PUNITIVENESS, AND
OUTGROUP CHARACTERISTICS


RALPH EPSTEIN AND S. S. KOMORITA

In order to determine the conditions for which the scapegoat hypothesis may
be valid, the effects of 4 independent variables (perceived parental ethnocentrism, 
punitiveness, racial, and socioeconomic characteristics of outgroups) upon 
social distance were investigated. The experimental situation consisted of 
presenting slides of a fictitious group; and in the various conditions, this group 
was depicted as white, Negro, or Oriental and middle or working class. The 
results indicated that high parental ethnocentrism associated with moderate
punitiveness is most conducive to the development of childhood ethnocentrism.
The finding that working-class characterization elicited greater social distance
towards Negro relative to other groups suggests the significance of stereotypes
in the development of prejudice.


The scapegoat hypothesis, a cornerstone of 
major conceptualizations regarding prejudice 
(Adorno, Frenkel-Brunswik, Levinson, & San- 
ford, 1950; Berkowitz, 1962) is based on the 
assumptions that severe discipline directed to- 
wards aggression may increase the instigation 
to aggress and that anticipation of punishment for
aggression directed towards the ingroup re- 
sults in displacement from the original sources 
of frustration to outgroups. Equivocal and 
contradictory results regarding this hypothe- 
sis (Masling, 1954; Miller & Bugelski, 1948; 
Mosher & Scodel, 1960; Stagner & Congdon, 
1955) indicate the utility of further research 
regarding those conditions for which the 
scapegoat hypothesis may be valid.

According to Allport (1954) and Zawadzki 
(1948), the predictive value of the scapegoat 
hypothesis has been limited by its focus on 
the motivational states of the prejudiced in- 
dividual and relative neglect of those out- 
group characteristics which facilitate their 
selection as targets for displaced hostility. 
Insofar as relatively few outgroup charac- 
teristics have been studied so far, for exam- 
ple, prior dislike (Berkowitz, 1962) and visi- 
bility (Williams, 1947), the major goal of 
this study was to investigate the develop- 
ment of social distance in children as a 
function of perceived parental punitiveness
towards aggression, perceived parental ethnocentrism,
and two major outgroup characteristics:
social status and race. Previous research
by the authors (Epstein & Komorita,
1965), carried out on an upper middle-class 
child population, was based on the assumption 
that severe parental discipline may sensitize 
the child to power relations of strong-weak 
or superior-inferior, as these dimensions are 
culturally defined by social status. Assuming 
that the severely disciplined child may be ex- 
cessively sensitized to differential status rela- 
tionships, it was predicted that he would de- 
velop greater social distance towards low 
status rather than middle-class groups and 
towards Oriental rather than white children. 
Contrary to prediction, a nonmonotonic re- 
lationship between parental punitiveness and 
social distance was obtained: moderate puni- 
tiveness resulted in maximal social distance, 
and low socioeconomic status of the out- 
groups elicited greater social distance than 
their ethnic characteristics. 

In order to obtain further clarification of 
these provocative findings, the current study 
was carried out. The present study differs 
from the former in three important respects. 
Since a significant relationship between pa- 
rental punitiveness and children’s social dis- 
tance may be based on a third variable, 
namely, parental ethnocentrism, an attempt 
was made to obtain children’s reports of their 
parents’ ethnic attitudes. Thus, data were ob- 
tained in order to clarify the controversy be- 
tween psychopathological theories of preju- 
dice (Adorno et al., 1950) which focus on 
parental discipline and social learning theo-
ries (Buss, 1961), which emphasize identifi- 
cation with parental attitudes. Furthermore, 
insofar as religiosity among adults has ap- 
peared significantly related to prejudice (All- 
port & Kramer, 1946), it was decided to 
replicate the previous research on a parochial 
school population (Catholic). Finally, since 
the previous failure to find a significant re- 
lationship between social distance and the 
outgroup's racial characteristics may have 
been a function of the limited number of 
ethnic groups studied, a Negro condition was 
added to the white and Oriental groups. 


METHOD 
Subjects 


The sample consisted of 180 boys and girls who 
comprised the third through fifth grades at a Catho- 
lic parochial school in Detroit, Michigan. This school 
serves children whose socioeconomic background as 
determined by residential area is predominantly low 
middle class. 


Measure of Parental Punitiveness 


The Parental Punitiveness Scale (PPS) was de- 
veloped by the authors to measure children’s percep- 
tions of parental punitiveness towards aggression. A 
detailed description of the development of this scale 
is reported elsewhere (Epstein & Komorita, 1965a). 
Briefly, the scale consists of 45 items which measure 
parental punitiveness towards physical, verbal, and 
indirect aggression in each of five major situations: 
aggression towards parents, teachers, siblings, peers, 
and inanimate objects. The scale is scorable sepa- 
rately for father’s and mother’s responses to aggres- 
sion. However, since the correlation coefficient be- 
tween father and mother versions was found to be 
.60, the two scores were pooled to yield a single, av- 
erage punitiveness score. The split-half reliability of 
this average punitive score, with the Spearman- 
Brown correction, was .81. 
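
The Spearman-Brown step-up applied to a split-half coefficient can be sketched as follows. The article reports only the corrected value (.81); the half-test correlation shown here is back-computed from that figure for illustration and is not a value given by the authors.

```python
def spearman_brown(r_half: float) -> float:
    """Step a split-half correlation up to an estimate of full-test reliability."""
    return 2.0 * r_half / (1.0 + r_half)


def invert_spearman_brown(r_full: float) -> float:
    """Recover the half-test correlation implied by a corrected coefficient."""
    return r_full / (2.0 - r_full)


# Hypothetical, back-computed half-test correlation implied by the reported .81.
r_half = invert_spearman_brown(0.81)
print(round(r_half, 2))                  # about .68
print(round(spearman_brown(r_half), 2))  # recovers .81
```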


Experimental Conditions 


Three independent variables were used: race of 
target group—Negro, Oriental, and white; socioeco- 
nomic class of target group—lower versus middle 
class; and high, medium, and low groups on the 
basis of scores on the PPS. Thus, a 3 × 2 × 3
factorial design was employed with 10 subjects in each
of the 18 experimental conditions. 
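
As a check on the design arithmetic, the cells of the 3 × 2 × 3 layout and the degrees of freedom they imply can be enumerated in a few lines. This is a minimal sketch; the variable names are illustrative, and a fixed-effects analysis of variance with equal cell sizes is assumed.

```python
from itertools import product

races = ["Negro", "Oriental", "white"]
classes = ["lower", "middle"]
punitiveness = ["high", "medium", "low"]
subjects_per_cell = 10

# Every combination of the three factors is one experimental condition.
cells = list(product(races, classes, punitiveness))
n_total = len(cells) * subjects_per_cell

# Degrees of freedom for the main effects and the within-cell error term.
df_race = len(races) - 1
df_class = len(classes) - 1
df_punitiveness = len(punitiveness) - 1
df_error = n_total - len(cells)

print(len(cells), n_total)                           # 18 180
print(df_race, df_class, df_punitiveness, df_error)  # 2 1 2 162
```

The error term of 162 degrees of freedom agrees with the denominators reported for the F ratios in the Results.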

The basic purpose of the experimental conditions 
was to create specific cognitions regarding a fictitious 
group, the “Piraneans.” Accordingly, the subjects 
were presented slides which depicted Piraneans as 
either middle or lower class, and Negro, Oriental, or 
white. Race of Piraneans was varied by presenting 
slides of four Negro, Oriental, or white children, two 
boys and two girls each, who were representative of
the subjects’ age range. Socioeconomic class was
varied by presenting slides depicting residence and
working place of Piraneans. For example, the
working-class slides depicted scenes of a ramshackle
house, deteriorated slum streets, and street
construction, whereas the middle-class slides depicted a
new split-level house, suburban streets, and a modern
office building.

Prior to the group administration of the slides, 
the following instructions were given: 


There is a group of people whom most of you
have never seen. As a matter of fact, you have 
probably never heard of this group. They are 
called Piraneans. Would you like to see some 
slides of the Piraneans? 


After viewing the slides, the subjects completed a
seven-item social distance scale with regard to Pi- 
raneans. These items ranged from, “Would you want 
to marry these people when you grow up?” (mini- 
mal) to, “Would you want these people to visit your 
country?” (maximal). Each item could be answered 
by checking one of four alternatives ranging from
“very much yes” to “very much no.”

In order to minimize the potentially confounding 
factors of differential clarity and brightness, the 
slides were matched as closely as possible in terms 
of these variables. The low socioeconomic slides were 
based on scenes within Detroit slums whereas the 
middle socioeconomic slides were based on photos 
of suburban areas. Postexperimental interviews with 
a sample of subjects indicated that very few were 
able to state the specific locale of the slides although 
several subjects believed that the photos were taken 
within the United States. 

In order to determine the children’s general eth- 
nocentrism, ratings of the following groups were 
obtained after the experimental sessions: German, 
French, Catholic, Italian, Mexican, Negro, Japanese,
Jewish, and Russian. Social distance scores for each
of these groups were then pooled to obtain a measure
of generalized social distance. Three weeks later,
the subjects were requested to indicate how they
thought their parents would rate these same groups
on the social distance scales. Thus, measures of the
child’s and parents’ ethnocentrism, as perceived by 
the child, were obtained. 


RESULTS 


For the purpose of intergroup comparisons,
Table 1 summarizes the means and standard
deviations of the social distance scores for the
18 experimental groups.

An analysis of variance of these scores indicated
that, with regard to main effects, the
effect of levels of parental punitiveness was
not significant at the .05 level (F = 2.8,
df = 2/162). However, the main effects for
race and social class were significant at the
.01 level (F = 8.95, df = 2/162; F = 16.99,
df = 1/162, respectively). These results indicate
that the working-class condition elicited
greater social distance relative to the
middle-class condition and “Negro” elicited more
social distance than “Oriental” or “white”
conditions.

TABLE 1
MEANS AND STANDARD DEVIATIONS OF PIRANEAN SOCIAL DISTANCE SCORES
FOR EXPERIMENTAL GROUPS

                      Working class                 Middle class
Parental
punitiveness     Negro   Oriental   White     Negro   Oriental   White      M
High             20.6     15.6      15.5      14.6     14.1      13.1     15.6
                (5.34)   (3.81)    (4.8)     (4.98)   (1.7)     (2.81)
Medium           18.7     14.6      14.0      13.2     14.6      12.5     14.6
                (4.21)   (4.92)    (3.57)    (3.67)   (5.02)    (2.73)
Low              21.1     16.1      16.3      17.4     14.0      14.5     16.5
                (4.76)   (3.94)    (4.93)    (5.86)   (2.65)    (3.9)

Mean social distance
  Working class  16.9
  Middle class   14.2
  Negro          17.5
  Oriental       14.8
  White          14.3

Note. Standard deviations appear in parentheses.

The interaction between socioeconomic
status and race, significant at the .05 level
(F = 3.28, df = 2/162), indicates that the
lower class characterization elicited greater
social distance towards the Negro relative to
the white (t = 5.73, p < .01) and Oriental
(t = 5.46, p < .01) groups; that is, whereas
middle-class status served to minimize differences
in attitudes toward the ethnic groups,
working-class status enhanced differential
social distance toward Negroes on the one hand,
and whites and Orientals on the other.

In order to further delineate the antecedents
of social distance, the subjects’ Piranean
social distance scores for each experimental
condition were correlated with: the subjects’
ethnocentrism, measured by the subjects’ reports
of their own social distance attitudes
towards the 10 nonfictional groups; perceived
parental prejudice, as measured by the subjects’
reports of parental attitudes toward
Negro and Oriental (summation of social distance
ratings of Chinese and Japanese); and
general parental ethnocentrism, as measured
by children's reports of parental attitudes
towards the 10 nonfictional groups. These
correlation coefficients are shown in Table 2.

TABLE 2
CORRELATIONS BETWEEN CHILDREN’S SOCIAL DISTANCE
TOWARDS “PIRANEANS” AND CHILDREN’S AND
PARENTAL PERCEIVED ETHNOCENTRISM

                                        Parental prejudice
                      Children's        toward “Negro”        Parental
Condition             ethnocentrism     and “Orientals”       ethnocentrism
White
  Middle class        .41*  (30)        .09   (25)            -.02   (25)
  Lower class         .44*  (30)        .68** (23)             .25   (23)
Oriental
  Middle class        .27   (30)        .53** (26)             .40*  (26)
  Lower class         .69*  (30)        .60** (30)             .42*  (30)
Negro
  Middle class        .47** (30)        .65** (26)             .36   (26)
  Lower class         .36*  (30)        .59** (27)             .30   (27)
Mean correlations(a)  .45** (150)       .61** (132)            .35** (132)

Note.—Numbers in parentheses denote sample size. Sample
size for perceived parental prejudice vary for experimental
groups because some subjects were absent for the second
experimental session.

a Excluding correlation coefficient for “white, middle-class” condition.

Table 2 indicates that social distance towards
a fictitious group is significantly correlated
with the child's general level of ethnocentrism,
and with perceived parental
prejudice towards the specific ethnic groups
depicted on the slides. However, for the
correlations with perceived parental ethnocentrism,
only two of the six correlations are
significant at the .05 level. On the other hand,
the average correlation, weighted by sample
size and pooled over five of the six experimental
groups, is .35, with 132 df. This is
significant at the .01 level. These average
correlations do not include the data for the
white, middle-class Piranean condition since
there is no theoretical rationale for assuming
that parental ethnocentrism would correlate
with social distance towards this group. The
striking result of the data of Table 2 is that,
except for the one correlation of -.02, the
correlations are consistently positive as well
as moderately high. Thus, a moderate proportion
of the variance of children's social
distance toward the fictitious group can be
accounted for by their perception of parental
prejudice.
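
The pooled coefficient of .35 can be reproduced from the parental ethnocentrism column of Table 2. The sketch below assumes the pooling was a simple average of the raw correlations weighted by sample size (the text does not say whether a Fisher z transformation was used) and omits the white, middle-class condition as the authors do.

```python
# (r, n) pairs for perceived parental ethnocentrism, taken from Table 2,
# excluding the white, middle-class condition.
pairs = [(0.25, 23), (0.40, 26), (0.42, 30), (0.36, 26), (0.30, 27)]


def weighted_mean_r(pairs):
    """Average correlations weighted by sample size (assumed pooling rule)."""
    total_n = sum(n for _, n in pairs)
    mean_r = sum(r * n for r, n in pairs) / total_n
    return mean_r, total_n


mean_r, total_n = weighted_mean_r(pairs)
print(round(mean_r, 2), total_n)  # 0.35 132
```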

With regard to children's general ethnocentrism
as a dependent variable, the correlation
between children's and perceived parental
ethnocentrism was .48, with 155 df.
This is significant at the .01 level. The
correlation between children's ethnocentrism and
perceived parental punitiveness, on the other
hand, was only .02. However, there were reasons
to believe that this relationship might be
nonlinear and that there might be an interaction
between the effects of parental punitiveness
and parental ethnocentrism on children's
ethnocentrism. Accordingly, the data
were cast into a 3 × 3 factorial design with
three levels each of parental punitiveness and
parental ethnocentrism. Of the 157 subjects
for whom complete data were available, 13
subjects were randomly eliminated in order
to form equal cell frequencies. This resulted
in 16 subjects in each of the nine experimental
conditions for a total of 144 subjects.
The analysis of variance resulted in a significant
main effect of parental ethnocentrism
(F = 18.89, df = 2/135, p < .01); this result
simply reflects the positive relationship
between parental ethnocentrism and children's
ethnocentrism found in the previous analysis.
The main effect of parental punitiveness was
not significant at the .05 level; however, the
interaction between the two variables was
significant at the .05 level (F = 2.92, df =
4/135), and the nature of this interaction is
depicted in Figure 1.

FIG. 1. Children’s ethnocentrism as a function of
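
The random elimination used to equalize cell frequencies can be illustrated with a short sketch. The individual cell sizes are not reported in the article, so the sizes below are hypothetical values chosen only so that they total 157 with a smallest cell of 16; the function name and seed are likewise illustrative.

```python
import random


def equalize_cells(cells, seed=0):
    """Randomly drop subjects so that every cell matches the smallest cell."""
    rng = random.Random(seed)
    target = min(len(members) for members in cells.values())
    return {key: rng.sample(members, target) for key, members in cells.items()}


# Hypothetical cell sizes (not from the article) summing to 157.
sizes = [16, 17, 18, 19, 16, 18, 17, 18, 18]
levels = ["low", "medium", "high"]
keys = [(punit, ethno) for punit in levels for ethno in levels]
ids = iter(range(sum(sizes)))
cells = {key: [next(ids) for _ in range(n)] for key, n in zip(keys, sizes)}

balanced = equalize_cells(cells)
kept = sum(len(members) for members in balanced.values())
print(kept, sum(sizes) - kept)  # 144 13
print(kept - len(balanced))     # error df for the 3 x 3 analysis: 135
```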

Application of Duncan's multiple range 
test (Edwards, 1960) to the data in Figure 1 
indicated that if perceived parental ethno- 
centrism is low or moderate, children's eth- 
nocentrism is independent of parental puni- 
tiveness. On the other hand, if a child per- 
ceives his parents to be highly ethnocentric, 
the effects of parental punitiveness are sig- 
nificantly nonmonotonic. Thus, the data sug- 
gest that high parental ethnocentrism asso- 
ciated with moderate punitiveness is most 
conducive to the development of ethnocen- 
trism in children. It should be noted that the 
same functional relationship was obtained in 
a previous study on a different population of 
children (Epstein & Komorita, 1965b). 


DISCUSSION


This research suggests that childhood eth- 
nocentrism is related to an interaction be- 
tween parental ethnocentrism and punitive- 
ness. The striking relationship between pa- 
rental ethnocentrism and children's social 
distance attitudes towards fictional as well 
as nonfictional groups is consistent with a 
social learning theory which focuses upon 
parental reward and approval for successful 
imitation as the basis for childhood preju- 
dice (Buss, 1961). This interpretation is con- 
sonant with recent investigations which have 
reported that, for specific populations, children's
prejudices may directly reflect parental
ethnocentrism. For example, Anisfeld,
Munoz, and Lambert (1963) reported that
ethnocentrism among Jewish children is related
to identification with parental beliefs.
Mosher and Scodel (1960) demonstrated that
among middle-class, Protestant children,
social distance attitudes were related to maternal
ethnocentrism, but failed to correlate
with authoritarian child-rearing practices.
Unlike these investigations, however, the 
current study suggests the nature of those 
conditions which facilitate the imitation of 
parental ethnocentrism, namely, moderately 
punitive discipline. The finding that moderate
discipline is more effective than weak or
severe discipline in the transmission of parental
attitudes to the child appears consistent
with recent theory and research regarding
the antecedents of identification (Sears, Mac- 
coby, & Levin, 1957; Whiting & Child, 1953). 
According to these investigators, "love-ori- 
ented" discipline, which lies midway between 
permissiveness and punitiveness, is most 
likely to lead to behavioral similarity be- 
tween parent and child. Thus, it may be as- 
sumed that a major consequence of moderate 
discipline is to orient the child towards ob- 
taining parental approval and to create some 
doubt regarding its achievement. The child 
may seek to reduce this doubt by internaliz- 
ing parental attitudes and values. On the 
other hand, the avoidant tendencies and excessive
autonomy elicited by high punitiveness
or permissiveness, respectively, are likely
to inhibit the process of identification.

These results are compatible with the results
of our previous study on middle-class
children which indicated that moderate discipline
is related to childhood ethnocentrism
(Epstein & Komorita, 1965b). However, insofar
as the measurement of parental ethnocentrism
was not possible in that study, the interpretation
of the previous results was necessarily ambiguous.
Furthermore, our findings suggest
that the basic assumption inherent in the
scapegoat hypothesis of a monotonic relationship
between parental punitiveness and displaced
aggression requires modification. It
would seem that given a population, such as
the one employed in this study, in which a
fairly high level of ethnocentrism is present,
it is probable that training conditions conducive
to parental identification, that is,
moderate discipline, are likely to constitute
an important antecedent of prejudice in children.
Moreover, an important outcome of this
study is the finding that in addition to pa- 
rental attitudes and behaviors, the perceived 
characteristics of outgroups (race and socio- 
economic status) influence children’s social 
distance attitudes. Whereas both Negro and 
white groups elicited greater social dis- 
tance when associated with working- rela- 
tive to middle-class status, working-class 
status greatly accentuated prejudice towards 
the Negro as compared with the white group. 
However, middle-class status served to mini- 
mize differential attitudes as a function of 
ethnic affiliation. 

The assumption that social distance to- 
wards working-class Negroes may be pri- 
marily a function of stereotypes which elicit 
emotional and evaluative responses towards 
this group is supported by Katz and Braly’s 
(1958) report that preferences for foreign 
groups are determined by the desirability of 
characteristics previously attributed to these 
groups, and Negro elicits highly consistent 
and negative stereotypes compared with other 
groups. Our finding that prejudice towards
the Oriental group is not differentially af- 
fected by social status suggests that cultural 
stereotypes towards Orientals are both less 
crystallized and less negative as compared 
with stereotypes regarding Negroes (Katz & 
Braly, 1958). Stereotypes may contribute to 
social distance by reducing dissonance asso- 
ciated with rejection of outgroups with whom 
there has been minimal contact (Festinger, 
1957) or facilitating the attribution of dis- 
similar beliefs and values to outgroups, 
thereby contributing to the “belief preju- 
dice” (Rokeach, 1960). 

This study corroborates previous research 
by the authors (Epstein & Komorita, 1965a, 
1965b) which suggests that the socioeconomic 
status of the outgroups is a basic variable re- 
lated to the development of both prejudicial 
attitudes in children and aggressive acts 
among adults. Insofar as inferior social status 
appears to have the most detrimental effects 
on social distance towards Negroes, increas- 
ing opportunities for social mobility may 
modify hostile attitudes towards Negroes to 
a significantly greater degree relative to other 
ethnic groups. 

In conclusion, this study points to the
utility of investigating childhood prejudice as 
a joint function of child-rearing attitudes and 
practices, and stimulus characteristics of out- 
groups. 

Responses indicating states of affect were made by 156 Ss to 3-person situations
in which 3 of the 6 possible attitudes were known, all 8 possible combinations
of liking and disliking being systematically varied. Only in the 4 situations
where attitudes of S's well-liked friend were known to S were responses of
pleasantness or unpleasantness (hypothetically corresponding to balance and
imbalance) as predicted by Heider-like theories of balance. These discrepancies
are examined in the light of forces presumably aroused by uncertainty, by
indifference, and by assumptions concerning reciprocation of favorable attitudes.
A distinction among balanced, nonbalanced, and imbalanced systems
seems necessary.


This report raises problems concerning 
contemporary versions of balance theory 
(Brown, 1962; Heider, 1958; Newcomb, 
1959) and proposes some concepts that may 
contribute to the understanding of otherwise 
troublesome findings. Balance theories assume 
that when systems are put into states of 
disequilibrium, forces arise to restore equilib- 
rium or balance. The most unambiguous in- 
stance of imbalance, in interpersonal systems, 
occurs when the attitudinal relationship of 
person P toward another person, O, is positive
(e.g., liking rather than disliking) and when
there is a discrepancy between P’s attitude
toward an object, X (or, alternatively, a third
person, Q) and what P perceives O’s attitude
toward X or Q to be. Previous research has
shown that when P’s attraction to O is positive,
imbalance arising from discrepant attitudes
toward X or Q does lead to feelings
of uneasiness (Jordan, 1953; Price, 1961).
Under conditions of negative attitudes from
P to O, however, we have found affective
responses not predicted by present versions
of balance theory. 

Several critics have noted that some limiting
conditions need to be attached to existing
statements of balance theory (cf. Katz &
Stotland, 1959; Zajonc, 1960). The present
paper is offered as one step in that direction.


Special gratitude is expressed to Deon Price for
her assistance in clarifying and editing.


METHOD 


The present population consists of an original 
study (N = 41) and three replications over a period 
of 2 years (N = 28, 31, and 56) making a total of
156 subjects, all of whom were introductory psy- 
chology students (40% male) at the University of 
Michigan. The procedures, almost identical in all 
administrations, are more fully described in Price 
(1961) and in Price, Harburg, and McLeod (1965). 
Subjects were presented with a paper-and-pencil 
instrument and asked to select their two best 
friends (designated hereafter as 1 and 2), and two 
persons whom they strongly disliked (designated 
hereafter as 7 and 8). These persons were to be 
peers of the same sex as the subject, and were to 
be fellow students on the campus in order to insure 
that the hypothetical situations might plausibly 
have happened. Subjects also indicated on a 10-point 
scale how much they liked 1 and 2 and how much 
they disliked 7 and 8. After selecting these persons, 
each subject was asked to complete eight hypo- 
thetical situations (labeled A-H) in which the sub- 
ject was to insert the initials of the appropriate per- 
sons in the spaces left for that purpose. Then the 
subject was to indicate by a check mark on the 
line below each situation how the situation made 
him feel. This line, following directly under each 
paper-and-pencil situation, was 90 millimeters long 
and extended from “uneasy” through a neutral point 
to "pleasant." It is a modified version of Jordan's 
(1953) scale. An example of one of these situations 
is as follows (see Table 1, Situation D): 


I strongly like (1); I strongly dislike (7).
I see (1) strongly likes (7).

I feel:  uneasy ____________________ pleasant


These situations were always presented to the subject
in the same order of relations: P (the subject)
strongly likes or dislikes O; P strongly likes or
dislikes Q; O strongly likes or dislikes Q. The P-to-O
and the P-to-Q relationships were established by the
subjects, while the O-to-Q relationship was given
by the experimenter.

K. O. PRICE, E. HARBURG, AND T. M. NEWCOMB

TABLE 1
PERCENTAGE OF SUBJECTS RESPONDING WITH VARYING SIGNS OF AFFECT TO
EIGHT BASIC SITUATIONS (N = 156)

                                              Sign of affect
           Type of situation                                           Significance of
                                       -          N          +         pleasant-uneasy
Situation  P-to-O  P-to-Q  O-to-Q   (Uneasy)  (Neutral)  (Pleasant)    distribution(a)
Part I: Positive P-to-O
A(b)         +       +       +         6%        7%        87%         p < .001
B(b)         +       -       -         5         6         89          p < .001
C            +       +       -        89         0         11          p < .001
D            +       -       +        84         8          8          p < .001
Part II: Negative P-to-O
E            -       +       +        65%       15%        22%         p < .001
F            -       -       -        17        38         45          p < .001
G(b)         -       +       -        43        22         35          p < .50
H(b)         -       -       +        28        39         33          p < .15

a Computed by chi-square; see text for procedures employed.

b Balanced situations, according to Heider's formula, all others being imbalanced.

Subjects’ responses were categorized in terms of 
percentages who checked more than 2 millimeters to 
the left of the neutral point (uneasy), who checked 
within 2 millimeters to the left or right of the mid- 
point (neutral), and who checked more than 2 milli- 
meters to the right of the neutral point (pleasant). 
Our intent was simply to score the presence or ab- 
sence of affect and its direction. 
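
The scoring rule just described can be expressed as a short sketch. It assumes that a check mark is recorded as a distance in millimeters from the "uneasy" end of the 90-millimeter line and that the neutral point lies at the midpoint (45 millimeters); the function names are illustrative and are not taken from the original instrument.

```python
LINE_LENGTH_MM = 90.0
NEUTRAL_POINT_MM = LINE_LENGTH_MM / 2  # assumed to be the midpoint of the line
NEUTRAL_BAND_MM = 2.0                  # marks within 2 mm either side count as neutral


def classify_mark(position_mm: float) -> str:
    """Classify a check mark measured in mm from the 'uneasy' end of the line."""
    if not 0.0 <= position_mm <= LINE_LENGTH_MM:
        raise ValueError("mark lies off the 90 mm line")
    offset = position_mm - NEUTRAL_POINT_MM
    if offset < -NEUTRAL_BAND_MM:
        return "uneasy"
    if offset > NEUTRAL_BAND_MM:
        return "pleasant"
    return "neutral"


def percentages(marks):
    """Percentage of marks falling in each of the three categories."""
    counts = {"uneasy": 0, "neutral": 0, "pleasant": 0}
    for mark in marks:
        counts[classify_mark(mark)] += 1
    return {label: 100.0 * count / len(marks) for label, count in counts.items()}


print(percentages([10.0, 44.0, 46.5, 80.0]))
```

Under this rule the neutral band covers 4 of the 90 millimeters, with 43 millimeters on either side.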

Our decision to test propositions concerning psy- 
chological balance and imbalance through responses 
indicating affective states came after extensive pretesting
in which we found that, if people felt pleasant
in a hypothetical situation, they would also, if
given the same situation without the completed 
O-to-Q relationship, almost invariably fill in that 
relationship with liking or disliking in ways pre- 
dicted by the theory. Even though we attempted to 
make the situations as lifelike as possible by using 
the initials of actual acquaintances, we are testing 
only whether the subjects’ affect in hypothetical 
situations can be predicted from balance theories. 


RESULTS 


Responses are tabulated in Table 1 for the 
eight basic situations, each of which involves 
a third person (Q), rather than a nonperson 
(X). As shown by the directions of the arrows
appearing in the eight situations, subjects
were given no information about the
Q-to-O relationship.

In all eight situations, the distributions of 
responses among the three categories differ 
from chance expectation at significance levels 
beyond .001, by chi-square. (Since we know 
of no reason to assume that, across all situa- 
tions, the distribution of responses would be 
bell-shaped rather than rectangular, we have 
chosen the latter, and simpler, definition of 
expected frequencies, which thus become 
43/90, 4/90, and 43/90, respectively, of responses
by all 156 subjects.) If only the uneasy-pleasant
dichotomy is considered, distributions
differed from chance expectations
at the same level except for the last two situations,
G and H. (Expected frequencies were
taken as half “uneasy” and half “pleasant.”)
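
The three-category test can be sketched as a chi-square goodness-of-fit computation against the rectangular expectation of 43/90, 4/90, and 43/90. The observed counts below are approximations obtained by rounding the Situation A percentages (6%, 7%, 87%) to whole subjects out of 156; the article does not report the raw counts.

```python
N = 156
EXPECTED_PROPORTIONS = [43 / 90, 4 / 90, 43 / 90]  # uneasy, neutral, pleasant


def chi_square(observed, expected):
    """Pearson chi-square statistic for observed versus expected frequencies."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Approximate counts for Situation A, rounded from the reported percentages.
observed = [9, 11, 136]
expected = [N * p for p in EXPECTED_PROPORTIONS]

statistic = chi_square(observed, expected)
CRITICAL_001_DF2 = 13.816  # chi-square critical value for df = 2, alpha = .001

print(round(statistic, 1), statistic > CRITICAL_001_DF2)
```

With three categories the test has 2 degrees of freedom, so any statistic above 13.816 is significant beyond the .001 level.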

In all situations in Part I of this table, P
likes O, and the results are as predicted (by
Heider and Newcomb, among others). In Part
II of the table are situations in which P dislikes
O, but here the results are less predictable
by any theory. The two most general
characteristics of responses in Part II,
as compared with Part I, are the greater number
of neutral responses and the relative fewness
of pleasant responses. In only one of
these situations (F, where all three relation- 
ships are negative) do pleasant responses sig- 
nificantly exceed uneasy ones. 


DISCUSSION


Hypothetical Determinants of Situational 
Differences 


Two general features of the empirical findings
constitute our theoretical problem.

1. Current versions of balance theory pre- 
dict neatly to findings when P-to-O is posi- 
tive; in each of the first four situations more 
than 80% of responses are as predicted, and 
neutral responses do not significantly exceed 
expected frequencies. When P-to-O is nega- 
tive, however, responses to only one of four 
situations (E) are as predicted by Heider’s 
formula, according to which sets of three re- 
lationships are in balance if none or two of 
them are negative, and otherwise imbalanced 
(Heider, 1958, p. 202). 

2. Responses to situations that are identical,
except that P’s relationships to O and
to Q are reversed, are by no means the same,
in spite of the fact that O (in the one situation)
and Q (in the other situation) are persons
toward whom P has the same attitude.
The two possible comparisons of this kind are
B with G, and D with E; in the former comparison
the direction of preponderant response
is actually reversed when the positions of the
signs are reversed. In sum, responses to three
of the eight situations are not as predicted by
Heider’s formula; all of these three involve
negative P-to-O attitudes, while situations
involving negative P-to-Q attitudes are not
imbalanced except when they also include negative
P-to-O attitudes.
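
Heider's formula as stated in point 1 (a set of three relationships is balanced if none or two of them are negative) can be written as a short sketch. The coding of liking as +1 and disliking as -1, and the function name, are assumptions made for illustration.

```python
from itertools import product


def is_balanced(p_to_o: int, p_to_q: int, o_to_q: int) -> bool:
    """Heider's rule: balanced when zero or two of the three signs are negative."""
    negatives = sum(1 for sign in (p_to_o, p_to_q, o_to_q) if sign < 0)
    return negatives in (0, 2)


# Enumerate all eight combinations of liking (+1) and disliking (-1).
for signs in product((+1, -1), repeat=3):
    label = "balanced" if is_balanced(*signs) else "imbalanced"
    print(signs, label)
```

Four of the eight sign patterns are balanced under this rule; the pattern with three negative relationships counts as imbalanced.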

As compared with POX systems, in which 
X is a nonhuman entity, our three-person systems
appear to have two principal characteristics.
P’s response is likely to be affected by
his assumptions of reciprocated attitudes on the 
part of O and Q toward himself and from Q 
to O. And, since he has been given no infor- 
mation about these attitudes, P’s assessment 
of the situation is likely to be affected by his 
uncertainty—perhaps stemming from its rela- 
tive complexity—as well as lack of informa- 
tion about reciprocal attitudes. We shall there- 
fore examine some of the phenomena asso- 
ciated with assumptions about reciprocation 
and with uncertainty. 

Uncertainty about Q-to-O may be a reason 
why the sign of the P-to-O relationship but 
not of P-to-Q has such noticeable effects on 
subjects’ responses. The systematic difference 
(as experimentally provided) between O and 
Q is that O-to-Q is known, whereas none of 
Q's attitudes are known. Thus Q is not only 
a source of uncertainty, but also (as con- 
trasted with a nonperson X) a possible source 
of attitudes that could render the total situ- 
ation an "unpleasant" one. P's relative un- 
certainty concerning O's response to P him- 
self, on the other hand, is mitigated by two 
considerations: he knows what O-to-Q is, and 
he assumes, when P-to-O is positive, that O 
reciprocates his own liking. (Empirical data, 
our own included, indicate that such assumptions
are almost universal. Each subject
(N = 156) was asked to report what his best-liked
friend felt about him: 98% reported
that they were liked in return, 0% that their
best friend disliked them, and 2% did not
know.) The assumed (and welcome) reciprocity
of liking between P and O provides
a solid and unambiguous structure for assessing
the entire situation, so that neutral responses
are few. The solidity in Situations
A-D, stemming from O-to-P attitudes that
are both “known” and welcome, appears adequate
to overcome the effects of any uncertainty
that P may have about Q.

When P-to-O is negative, however, assump- 
tions of reciprocity are not dependably made. 
(When subjects were asked to indicate 
whether the disliked person whom he chose 
liked or disliked or was neutral to the sub- 
ject in return, 27% judged that their own 
dislike of O was reciprocated, 26% reported 
that O-to-P was positive, and 47% were 
unsure.) In Situations E-H, therefore, P was 
typically unsure about attitudes toward him- 
self on the part of O, the one person in the 
situation about whom he had some information.
Hence the solidity provided by Situations
A-D was missing.

Other evidence also suggests that, parallel- 
ing uncertainty, there is a good deal of ambi- 
valence toward persons whom our subjects 
disliked, compared with those whom they 
liked. We found, for example, that on a 10-point
scale of liking, the mean rating of a
"best friend" was 8.3, with a standard deviation
of 0.7. The mean rating of a "strongly
disliked person," however, on a 10-point scale 
of dislike, was only 5.5, with a standard devi- 
ation of 2.7. Though we did not ask respond- 
ents to indicate both degrees of liking and 
disliking for the same person, we assume 
that—especially in the case of negative re- 
sponses—both positive and negative compo- 
nents are commonly present in the same P's 
attitudes toward the same O. The frequency
of neutral responses in Situations E, F, G,
and H also suggests ambivalence; all four of
these frequencies exceed chance expectations 
beyond the .001 level. 

Partly as a consequence of uncertainty, 
ambivalence, and assumed reciprocation, the 
eight situations also differ with respect to P's 
engagement in the triadic system, in the sense 
that the giving of positive attraction involves 
concern, together with susceptibility to reciprocation
or rebuff as well as to forces toward
balance. Engagement corresponds, loosely at
least, to a unit relationship (in Heider's
sense) on the part of P toward O or Q. Degree
and quality of engagement may be categorized
as follows.

1. There is little or no engagement if, as in 
Situations F and H, both O and Q are dis- 
liked; here neutral responses are shown to 
be conspicuously high. Several subjects wrote 
in comments like “I dislike both of them, and 
couldn't care less what they think." In each 
of four separate studies we have found neutral
responses to these situations to be very
frequent, and “pleasant” responses more so
than “uneasy” ones. We have other data
showing that when O was only slightly disliked
(defined as a “passerby”), the reports
of neutrality decreased and those of pleasantness
increased. In similar vein Davol (1959)
reports a far higher proportion of “triple
negative” sociometric triads than would be
predicted by chance or by current versions
of balance theory. Perhaps, in the context of 
other situations involving engagement with 
conflict, the mere absence of engagement is 
in itself welcome to many respondents. 

2. P is engaged, but with a good deal of un- 
certainty, in Situations E and G, because his 
only positive attraction is toward Q, about 
whose attitudes he knows nothing. In these 
situations we find both uncertainty and un- 
easiness relatively high. In either situation 
an “unknown” Q might, by “teaming up” 
with O whom P dislikes, and perhaps by 
failing to reciprocate P’s high regard, have 
the effect of isolating P. 

3. In the first four situations P may be
described as engaged but with little or no
uncertainty, O's attitudes being known or
inferred with a good deal of confidence. These
situations may be briefly described as follows:
(a) engagement without conflict or threat
from O or Q; (b) engagement only with O,
who presents no conflict or threat; (c) engagement
with O conflicts with engagement
with Q; (d) engagement only with O, whose
disagreement about Q presents conflict.

Under these conditions of relatively certain
engagement, whether or not involving conflict,
forces toward psychological balance seem
to operate freely. In the absence of engagement,
indifference prevails over considerations
of balance, while uncertain engagement
is associated with conflict in ways that seem
unrelated to balance.
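
The three categories of engagement described above depend only on the signs of P's own two attitudes, which the following sketch makes explicit. The +1/-1 coding and the category labels are illustrative assumptions and are not part of the original instrument.

```python
def engagement(p_to_o: int, p_to_q: int) -> str:
    """Categorize P's engagement from the signs of P-to-O and P-to-Q."""
    if p_to_o > 0:
        # First four situations: O is liked and O's attitude toward Q is known.
        return "engaged, little uncertainty"
    if p_to_q > 0:
        # P's only positive attraction is toward Q, whose attitudes are unknown.
        return "engaged, uncertain"
    # Both O and Q are disliked.
    return "little or no engagement"


for signs in [(+1, +1), (+1, -1), (-1, +1), (-1, -1)]:
    print(signs, engagement(*signs))
```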


General Considerations


We shall now pursue our examination of 
the situations in which P dislikes an O whose 
attitude toward Q is known, in hopes of better 
understanding the general conditions under 
which Heider's formula does and does not
predict to empirical findings. Beyond this, an
understanding of these conditions may con- 
tribute to the general theory of psychological 
balance. Psychologically, imbalance involves
either a state of conflict (the preference for 
a consistent, neatly ordered world versus the 
confrontation of a less consistent reality) or 
anxiety over the possibility of such conflict, 
or both. In the case of human triads, our evi- 
dence suggests that the component of uncer- 
tainty is not very prominent provided that P 
is strongly attracted toward an O whose atti- 
tude toward Q is known. Under these condi- 
tions P's feeling-state is readily predictable 
from the known relationships of P-to-O, in 
ways outlined by Heider. Our evidence also 
suggests that the lack of uncertainty under
these conditions stems from P's highly prob- 
able tendency to assume that O reciprocates 
his own favorable attitude. Given recipro- 
cated high attraction from an O whose atti- 
tudes toward Q are known, demands for con- 
sistency provide an adequate basis for pre- 
dicting P’s feelings about the total situation. 
Thus Heider’s formula, presumably based on 
such demands alone, finds empirical support 
when P-to-O is positive. 

When P-to-O is negative, however, parame- 
ters other than those toward consistency arise. 
We have found it necessary to hypothesize 
that these include the following. 

1. The necessity of coping with uncertainty 
about the reciprocated attitudes of O and Q, 
especially when attitudes toward them are 
negative. 

2. When P is faced with uncertainty, his 
existing tendency towards ambivalence is ex- 
acerbated, so that feelings of liking for a dis- 
liked person and of dislike for one who is 
liked (feelings of which he may normally be 
only dimly aware) reach the threshold of 
clear awareness. It is even possible that for 
some subjects the previously less recognized
attitude becomes dominant over its oppo-
site, when they are confronted with such
uncertainty. 

3. Effects of the engagement that follows 
the giving of favorable interpersonal attitudes 
have interaction effects with uncertainty, and 
the total absence of engagement is associated 
with indifference. 

We have called upon such hypothetical 
effects because they seem necessary to ac- 
count for our findings. They may all be re- 
lated, directly or indirectly, to forces toward 
consistency, and (in degree if not in kind) 
they are all unique to POQ situations. Thus 
reciprocity involves consistency between the 
attitudes of two persons toward each other. 
Uncertainty, in part at least, concerns the 
possibility that consistency may be violated. 
Ambivalence has to do with the coexistence of 
opposites as a form of inconsistency. Engage- 
ment involves thresholds for considerations 
of inconsistency and conflict. 

An integrated theory of interpersonal bal- 
ance would not call upon certain parameters 
only under special conditions: when, for ex- 
ample, the joint object of P's and O's in- 
terest is a third person but not when it is a 
nonhuman entity; or when P-to-O is negative 
but not when it is positive. Any problem of 
attitudinal consistency or of interpersonal bal- 
ance would, ideally, take into account the 
possibility that forces like those we have 
called upon are at work, though perhaps mini- 
mally. If it seems unnecessary to postulate 
any effects of reciprocity, uncertainty, ambi- 
valence, and engagement in POX situations, 
or in POQ situations where P-to-O is positive, 
we suspect that this is either because their 
values are at or near the zero point under 
these conditions, or because they "happen" 
to operate in ways predicted by balance the- 
ory. The condition of negative P-to-O, in 
short, differs from the other conditions not 
in principle, but because such parameters as 
those that we have here called upon are com- 
monly not at or near the zero point, or be- 
cause they operate in opposition to forces to- 
ward balance. If so, then the general state- 
ment of conditions under which balance the- 
ory applies must include such parameters. 

These, of course, are not the only parame-
ters to be considered, but merely those sug-
gested by one set of data. Our basic point is
simply that theories of balance and consis- 
tency, like any others, need to be, as nearly 
as possible, whole theories. This requires that 
we bring to bear all possible psychological 
sophistication concerning such fundamental 
matters as the processing of information, 
interpersonal needs, and cognitive structuring. 
Forces toward balance and consistency are 
themselves outcomes of such psychological 
processes in situations that present an indi- 
vidual with multiple confrontations. 

These hypothetical parameters, finally, 
seem to us to correspond to psychological 
processes that are so basic to processes of 
seeking balance and avoiding imbalance that 
no simple formula will account for the latter. 
Our data force us to conclude either that it is 
the placement rather than the number of plus 
and minus signs that determines balance, or, 
that “pleasant” and “uneasy” states of affect 
do not necessarily correspond to balanced and 
imbalanced states of interpersonal systems. 
Balanced states, like any other concept, may, 
of course, be defined in any arbitrary way. 
Our own preference is for a definition that 
maintains Heider’s original contention that 
imbalance is drive-inducing, and tends to re- 
lease forces toward the restoration of bal- 
ance. According to this position, psychological 
balance must be defined not by a formula 
based on numbers of plus and minus signs, 
but in terms of patterns of system properties. 
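
The formula under discussion can be stated compactly. The following is a minimal sketch, assuming the usual sign-product reading of Heider's formula (a triad counts as balanced when the product of its three signs is positive, that is, when it contains zero or two minus signs). The function name and the +1/-1 coding are illustrative assumptions and are not part of the original analysis.

```python
def heider_balanced(p_to_o: int, p_to_q: int, o_to_q: int) -> bool:
    """Return True if the triad is balanced under the sign-product rule.

    Each argument is coded +1 (positive attitude) or -1 (negative attitude).
    """
    for sign in (p_to_o, p_to_q, o_to_q):
        if sign not in (1, -1):
            raise ValueError("each relation must be coded +1 or -1")
    # An even number of minus signs yields a positive product.
    return p_to_o * p_to_q * o_to_q > 0


# Two triads with the same number of minus signs (two) but different placement.
# The formula treats them identically, whatever affect the respondents report.
print(heider_balanced(+1, -1, -1))  # P likes O; both dislike Q
print(heider_balanced(-1, -1, +1))  # P dislikes O and Q; O likes Q
```

Because the rule depends only on the count of minus signs, it cannot by itself distinguish triads that differ in where the negative relation falls, which is the limitation the paragraph above describes.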

One possible approach (cf. Harary, Nor- 
man, & Cartwright, 1965; Newcomb, Turner, 
& Converse, 1965) is to distinguish among 
balanced states (toward which psychological 
forces are mobilized), nonbalanced states (of 
relative indifference), and imbalanced states 
(away from which psychological forces are 
mobilized). The first and third of these, as
our data seem to show, can be unambiguously
shown in three-person systems only when at- 
traction is positive toward a person at least 
some of whose attitudes are known. 
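
One possible reading of this three-way distinction is sketched below. It assumes that a negative P-to-O relation marks the nonbalanced case, and that, when P-to-O is positive, agreement between P and O about Q marks balance and disagreement marks imbalance. This rule and the names used are illustrative assumptions based on the paragraph above, not a formula stated in the text.

```python
def classify_triad(p_to_o: int, p_to_q: int, o_to_q: int) -> str:
    """Classify a P-O-Q triad as balanced, nonbalanced, or imbalanced.

    Each argument is coded +1 (positive attitude) or -1 (negative attitude).
    """
    for sign in (p_to_o, p_to_q, o_to_q):
        if sign not in (1, -1):
            raise ValueError("each relation must be coded +1 or -1")
    if p_to_o < 0:
        # Assumed: without positive attraction toward O, neither balance
        # nor imbalance is clearly in play.
        return "nonbalanced"
    # With positive attraction, agreement about Q is taken as balance.
    return "balanced" if p_to_q == o_to_q else "imbalanced"


for triad in [(1, 1, 1), (1, 1, -1), (-1, 1, 1), (-1, -1, -1)]:
    print(triad, classify_triad(*triad))
```

Under this sketch the placement of the minus sign matters: a single negative P-to-O relation yields "nonbalanced", whereas a single negative O-to-Q relation yields "imbalanced".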
FATHER CHARACTERISTICS AND SEX TYPING¹


JULES M. GREENSTEIN 
Stern College for Women 


In an attempt to test 2 identification theories of sex typing, frequency of 
overt homosexuality and test measures of latent homosexuality, feminine 
identification, and masculinity-femininity were obtained from 25 father-absent 
and 50 father-present male adolescent delinquents. The fathers of the father-
present Ss were rated for degree of dominance and closeness to their sons.
Results indicated no significant differences between the father-absent and 
father-present Ss and no significant correlations between father dominance 
and the sex-typing measures. Contrary to the developmental identification 
hypothesis, a small but significant correlation was found between degree of 
father closeness and frequency of overt homosexuality. Results were con- 
sidered to be more consistent with a differential reinforcement theory of 
sex typing than with identification theory. 


The development of sex-typical or sex- 
atypical behavior has usually been attributed 
to some process of identification or imitation 
learning. In the male child, it is often assumed 
that both the development of culturally nor- 
mative sex-appropriate social behavior (mas- 
culinity) and the development of sex-appro- 
priate biological behavior (heterosexuality) 
are contingent upon some characteristics of 
the child’s father. One prominent variant of 
Freudian identification theory labeled defen- 
sive identification by Mowrer (1950) and ag- 
gressive identification by Bronfenbrenner 
(1960) emphasizes the fear-evoking aspects 
of the father’s role as the agent which leads 
the male child to identify with the aggressor. 
Another version of identification theory, the 
developmental identification hypothesis, at-
tributes appropriate sex typing in males to
the nurturant and hence secondary reinforcing 
aspects of the father’s behavior (Mowrer, 
1950). A third approach, the social power 
theory (Parsons, 1955), considers identifica- 
tion to be a function of the relative power of 
the father as a controller of resources. 

Common to all identification theories are 
the following assumptions: the male child ini- 


¹ This paper is based on a dissertation submitted
to the Graduate School of Rutgers, The State Uni-
versity in partial fulfillment of the requirements for
the PhD degree. The author wishes to express his
appreciation to Nelson G. Hanawalt, under whose
supervision the investigation was conducted, and to
the staff of the New Jersey State Diagnostic Center
for their participation in the research.


tially identifies with the mother; through a 
modeling process, certain specified role char- 
acteristics of the father, such as his relative 
power or nurturance, facilitate or inhibit a 
shift in identification to the father; and the 
resulting father identification promotes ap- 
propriate sex typing such as masculine in- 
terests and attitudes and heterosexuality. 
Bronfenbrenner (1958) has pointed out that 
many studies of sex-role identification have 
left it unclear whether the identification re- 
fers to the learning process, the real or per- 
ceived similarity to a parent, or to an end 
product such as masculinity or heterosexual- 
ity. Nevertheless, considerable data have been 
collected which suggest that one or another 
aspect of the end product, sex typing, is re- 
lated to such model characteristics as father 
nurturance, father power, or some combina- 
tion of these consistent with major identifi- 
cation theory models (Bandura, Ross, & 
Ross, 1963; Mussen, 1961; Mussen & Distler, 
1959, 1960; Payne & Mussen, 1956; Sears, 
1953). 

The role of the father is so central to iden- 
tification theories of sex typing that a corol- 
lary assumption is often made that where the 
father is absent, the male child remains 
mother-identified and may be likely to de- 
velop feminine sex-social characteristics and 
latent, or even overt, homosexual tendencies 
(Fenichel, 1945, p. 95). Studies of the effects 
of father absence (Bach, 1946; Leichty, 
1960; Lynn & Sawrey, 1959; Sears, Pintler, 




& Sears, 1946) have suggested the develop- 
ment of inappropriate sex typing, but these 
studies have been limited to young children 
or to the effects of very short periods of father 
absence. The effects of prolonged father ab- 
sence on sex typing in the adolescent or adult 
have not been systematically determined. 
Practical problems in securing data have also 
limited research aimed at testing hypotheses 
as to the effects of various father characteris- 
tics on sex typing. Such father characteristics 
as closeness or relative dominance in the 
family have been measured only indirectly by 
the mother's report (Sears, 1953; Sears, Mac- 
coby, & Levin, 1957) or by the child's percep- 
tion of the father (Mussen, 1961; Mussen & 
Distler, 1959, 1960). Measures of sex typ- 
ing have been limited to choice of role in 
doll play or to responses to a masculinity- 
femininity inventory and have not been ex- 
tended to the dimension of homosexuality- 
heterosexuality. 

The present paper reports the results of an 
investigation which attempted to overcome 
the limitations of prior research by trying to: 
assess the effects of prolonged father absence 
on male adolescents; measure degree of father- 
closeness and relative decision-making domi- 
nance by direct observation rather than by 
indirect report; and evaluate the effects of 
absence, closeness, and dominance on the 
latent and overt homosexuality dimensions of 
sex typing as well as on the more conven- 
tional masculinity-femininity measures. 


METHOD 
Subjects 


The subjects were 75 delinquent boys ranging in 
age from 13 to 18 residing temporarily at the New 
Jersey State Diagnostic Center. Criteria for inclu- 
sion in the father-absent (FA) group required that 
the subject had spent at least 3 years of life prior 
to age 12 in a home where no male adult resided. 
Inclusion in the father-present (FP) group required 
that the subject had spent his entire life in a home 
consisting of both natural parents or substitute par- 
ents acquired during the first year of life. Sampling 
was on the basis of consecutive admissions meeting 
the criteria. 

FA group. The FA group consisted of 25 subjects 
with a mean age of 15.5, mean IQ of 98.0, and 
median length of father absence of 8.2 years. All had 
lost their fathers as a result of desertion, separation, 
or divorce.

FP group. The FP group consisted of 50 subjects
with a mean age of 15.3 and a mean IQ of 96.6. 
There were no significant differences between the FP 
and FA groups in age, IQ, or birth order. Although
the FP group may have been higher in socioeconomic
status because of the presence of a male breadwinner,
both groups were from the lower income levels and 
the mothers were of identical educational levels. 


Ratings of Father Characteristics 


Both parents of each subject in the FP group were 
interviewed frequently by a psychiatric social worker 
who rated the father on 6-point rating scales for 
dominance and closeness. 

Father-dominance rating. In forming their ratings 
of relative decision-making dominance, the inter-
viewing social workers were instructed to consider:


Who appears to take charge in asking questions, 
making demands, or relating events? 

Who seems to dominate when mother and father 
have minor disagreements? 

Who seems to be more confident in making 
minor decisions, signing permits, etc.?

Who takes the initiative in starting and termi- 
nating conversations? 

Does one tend to silence or belittle the other in 
his or her presence? 

What does each parent say about the other? 

Does one appear to be frightened of the other? 


In order to remain as close as possible to the mean- 
ing of dominance at the level of clinical observation, 
these questions were to be used as guidelines in form- 
ing a single global rating of father dominance on a 
6-point scale ranging from “Father is very clearly 
and strongly the dominant partner; mother has very 
little say in things” to “Mother is very clearly and 
strongly the dominant partner; father has very little 
say in things.” 

Father-closeness rating. A similar procedure was 
used for rating father closeness. The interviewing so-
cial workers were instructed to consider:


What does the father say about the son?

How familiar is he with those events in the son's
life that a father should know about?

How concerned is he with his son's adjustment?

Does he spontaneously go out of his way to
make things easier for his son?

Does he show ease or discomfort in talking to
his son?

Do they have things to talk about or does the
father devote his time to giving one-way sermons?

Does the father really seem to want to gain in-
sight into his son’s behavior?


The final rating was again on a 6-point scale rang-
ing from “Father is obviously very warm towards,
involved with, and emotionally close to his son” to
“Father is obviously cold towards, uninvolved with,
and emotionally distant from his son.”

Reliability of ratings. Since only the social worker 
routinely assigned to each subject had direct contact
with the parents, no direct measure of rating reli-
ability was obtainable. An effort was made to secure
an indirect estimate of rating reliability by the in- 
vestigator's interviewing each subject and question- 
ing him about his parents, their characteristics, and 
his relationship to them. From the manner in which 
the parents were described, ratings of dominance and 
closeness were obtained which were then correlated 
with the social worker ratings. Although this pro- 
cedure confounded reliability and validity of the 
ratings, it allowed for an estimate of the probable 
lower ranges of interrater reliability. The magnitude 
of the obtained correlations of r = .59 for dominance
and r = .68 for closeness suggests that interrater reli-
ability would have been considerably higher if it
were obtainable and that the social workers' ratings 
could be treated as reasonably objective. 


Measures of Sex Typing 


Three related, though logically distinct, aspects of 
sex typing were chosen: homosexual tendencies, fan- 
tasy identification, and masculinity-femininity. A va- 
riety of measures were chosen to tap both overt and 
covert aspects of these dimensions, selection being 
based on objectivity and reliability of scoring, past 
evidence of construct validity, and pertinence to the 
dimension measured. 

Overt homosexuality. Part of the routine examina-
tion at the Diagnostic Center is a psychiatric inter-
view conducted under sodium amytal medication.
During this interview, questions regarding the sub-
jects’ past behavior are asked which permit the
measurement of frequency of overt homosexual ex- 
periences. Since simple division of the sample into 
homosexual and nonhomosexual groups would re- 
quire better knowledge of what constitutes a nor- 
mal degree of homosexuality than is presently avail- 
able, a simple ordinal scale of overt homosexual 
tendencies was constructed. The classifications used 
and number of subjects in the combined FA and FP 
groups falling into each class were as follows: 

1. Frequent homosexual experiences since puberty
(n = 15).

2. Several homosexual experiences since puberty
(n = 11).

3. One homosexual experience since puberty (n = 8).

4. Nonhomosexual but sexually deviant experi-
ences (n = 10).²

5. No deviant experiences reported (n = 31).

Wheeler Rorschach Indices. The Rorschach indices 
of homosexuality originally derived by Wheeler 
(1949) have gained construct validity as a measure 
of latent homosexual tendencies in studies by Davids, 
Joelson, and McArthur (1956), Hooker (1958), and 
Aronson (1952). In slightly revised form to improve 
reliability by considering only the subject’s first two 
responses to each Rorschach card and by scoring 
each sign only once, the Wheeler Indices were used
as a measure of covert homosexual tendencies. The
interrater scoring reliability between the investi-
gator’s scoring and that of an independent rater
was found to be r = .86.

² All subjects in this category were referred for
[illegible] young girls, sexually exposing themselves,
or both.

VC Figure Preference Test. The Masculine Prefer- 
ence score of the VC Figure Preference Test (Web- 
ster, 1957) is a factorially derived measure of pref- 
erence for male over female sex symbols which are 
embedded in disguised form in a test of artistic pref- 
erence. Since one of the frequently reported charac- 
teristics of male homosexuals is a revulsion for fe- 
male organs or an attraction towards the male, the 
Figure Preference Test was used as a second meas- 
ure of latent homosexuality. 

TAT hero choice. As the measure of fantasy identi- 
fication, eight TAT cards depicting either a male 
and female figure together or a single figure of am-
biguous sex were presented to the subjects. Stories
were written in response to the usual instructions 
and were scored on a 5-point scale according to the 
degree to which the subject chose a male or female 
as the central character. Since adopting the point of 
view of a female in one's fantasy productions is im- 
plied in the construct feminine identification, the 
TAT Female Hero Choice measure was considered 
most appropriate for tapping this aspect of sex typ- 
ing. An r of .93 was obtained between the investi- 
gator's scoring and that of an independent rater. 

Masculinity-femininity. The three M-F scales of 
the Vassar College Attitude Inventory (Sanford, 
Webster, & Freedman, 1957) were used as measures 
of the masculinity-femininity dimension. These scales
provide independent measures of three clusters of 
items which are usually not separated in conven- 
tional M-F inventories. The three scales have been 
labeled by Bereiter (1960) “Feminine Interests” 
(MF-I), “Feminine Passivity” (MF-II), and “Femi-
nine Sensitivity” (MF-III).


RESULTS 
Differences between FA and FP Groups 


In most instances comparisons of FA and 
FP groups were possible by straightforward 
use of t tests for independent samples. Scores
on the Wheeler Rorschach Indices were 
sharply skewed and significance of the differ- 
ences between groups was tested by means 
of the nonparametric Mann-Whitney U test. 
Data on frequency of overt homosexual ex- 
periences presented in Table 1 were analyzed 
by treating the absence or presence of the 
father as a dichotomized variable; that is, as 
a set of ranks in which all subjects were 
tied within either of the two ranks. Kendall’s 
rank-correlation coefficient tau, corrected for 
ties in both variables (Kendall, 1948), was 
computed and its significance from zero 
tested. This permitted a more powerful test 
of the association between father absence and


TABLE 1

FREQUENCY OF HOMOSEXUAL EXPERIENCE IN
FA AND FP GROUPS

Type of experience                      Father absent (N)   Father present (N)

Frequent homosexual experiences                 7                   8
Several homosexual experiences                  4                   7
One homosexual experience                       3                   5
Nonhomosexual but deviant experience            2                   8
Nonhomosexual, nondeviant experience            9                  22


overt homosexuality than the more conven- 
tional chi-square test, since the latter is in- 
sensitive to order within the variables and 
would have required combining categories be- 
cause of the small expected frequencies. 
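
The nonparametric comparison used for the skewed Rorschach scores can be illustrated as follows. This is a minimal sketch of the Mann-Whitney U statistic computed from midranks. The function name is an assumption, and the two score lists are hypothetical values chosen only for illustration; they are not data from this study.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples, using midranks for ties."""
    # Pool the observations, remembering which group each came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values at positions i..j-1 share the average of ranks i+1..j.
        midrank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = midrank
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n_x, n_y = len(x), len(y)
    u_x = rank_sum_x - n_x * (n_x + 1) / 2
    u_y = n_x * n_y - u_x
    return u_x, u_y


# Hypothetical, sharply skewed scores for two small groups.
group_a = [0, 0, 1, 1, 2, 7]
group_b = [0, 1, 1, 2, 3, 3, 9]
print(mann_whitney_u(group_a, group_b))
```

The smaller of the two U values would then be referred to tables of U, or to its normal approximation for larger samples.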

The obtained tau of .15 for the data in 
Table 1 is not great enough to warrant re-
jection of the hypothesis of no association be-
tween father absence and overt homosexuality.
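
The tie-corrected coefficient can be sketched directly from the Table 1 frequencies. The code below assumes the tau-b form of the correction for ties in both variables, and it assumes a father-present count of 7 in the "several" category, a value inferred from the column total of 50. Because the original computational details are not given, the value it prints is not guaranteed to match the coefficient reported above. Function and variable names are illustrative.

```python
from math import sqrt

# Table 1 frequencies, ordered from "frequent" to "nondeviant".
father_absent = [7, 4, 3, 2, 9]
father_present = [8, 7, 5, 8, 22]  # the 7 is inferred from the column total


def tau_b_two_groups(a, b):
    """Kendall's tau-b between a two-level grouping and an ordered category."""
    k = len(a)
    # Pairs (one subject from each group) ordered the same way on both variables.
    concordant = sum(a[i] * b[j] for i in range(k) for j in range(k) if j > i)
    discordant = sum(a[i] * b[j] for i in range(k) for j in range(k) if j < i)
    n = sum(a) + sum(b)
    all_pairs = n * (n - 1) / 2
    # Pairs tied on group membership and pairs tied on category.
    tied_group = sum(g * (g - 1) / 2 for g in (sum(a), sum(b)))
    tied_category = sum(t * (t - 1) / 2 for t in (ai + bi for ai, bi in zip(a, b)))
    return (concordant - discordant) / sqrt(
        (all_pairs - tied_group) * (all_pairs - tied_category)
    )


tau = tau_b_two_groups(father_absent, father_present)
print(round(tau, 3))
```

A positive value here means that father-absent subjects tend to fall in the categories of more frequent homosexual experience.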

Table 2 presents the means obtained on 
test measures of homosexual tendencies, femi- 
nine fantasy identification, and masculinity- 
femininity. In each instance, a high score is 
in the direction of deviant sex typing. None 
of the differences approach significance and 
there is no consistent pattern favoring either 
group. 

Since there was considerable variation within 
the FA group in length of father absence, an 
effort was made to determine whether this 


TABLE 2

COMPARISON OF FA AND FP GROUPS ON TEST
MEASURES OF SEX TYPING

                                  Father absent   Father present
Measure                             (N = 25)         (N = 50)         t
                                       M                M

Homosexual tendencies
  Figure Preference Test             25.16            25.48         -0.18
  Wheeler Rorschach Indices a         2.16             2.48         -0.11
Fantasy identification
  TAT Female Hero Choice             20.50            20.56         -0.23
Masculinity-femininity
  Feminine interests                  5.68             5.42          0.60
  Feminine passivity                 19.84            19.92         -0.07
  Feminine sensitivity               18.00            17.24          0.89

a Means are presented only for comparison; significance was
tested by the Mann-Whitney U test.


source of variation may have concealed dif-
ferences between the groups. Length of father
absence within the FA group was correlated 
with each of the measures of sex typing. Pre-
sumably, a longer period of father absence
should result in a greater deviation in sex
typing according to identification theories, 
particularly since in the sample studied a 
shorter period of father absence usually meant 
the presence of a father during some fraction 
of the Oedipal period. The rank correlations 
obtained, however, ranged from .16 to —.17 
and none approached significance. 


Father Dominance and Sex Typing 


Since ratings for father dominance and fa- 
ther closeness were skewed and at no better 
than an ordinal level of measurement, the 
procedure of computing Kendall’s tau and 
testing its two-tailed significance was pre- 
ferred to the procedure of dichotomizing the 
FP group and thereby losing information 
about degrees of dominance and closeness. 

The Kendall rank correlations obtained be-
tween father dominance and the sex-typing
measures ranged from .12 to —.15. Since none 
approached significance, the hypothesis of a 
negative association between father domi- 
nance and deviation in sex typing was not 
confirmed. 


Father Closeness and Sex Typing 


The only significant findings were obtained 
when father closeness was correlated with the 
measures of deviation in sex typing. Here, 


TABLE 3

RANK CORRELATIONS BETWEEN FATHER CLOSENESS
AND SEX TYPING MEASURES

Measure                              Tau

Homosexual tendencies
  Overt homosexuality                [illegible]
  Figure Preference Test              .19
  Wheeler Rorschach Indices           .18
Fantasy identification
  TAT Female Hero Choice             -.17*
Masculinity-femininity
  Feminine interests                 -.05
  Feminine passivity                 -.09
  Feminine sensitivity               -.05


however, the significant findings are in a di-
rection converse to that predicted by the de-
velopmental identification theory. Reference
to Table 3 shows that the greater the degree
of father closeness, the greater the frequency
of overt homosexual experiences. When the 
measures of latent homosexual tendencies are 
considered, the correlations are in the same 
direction and approach statistical significance 
(p < .09, two-tailed). Father closeness does
not appear to affect the masculinity-femi-
ninity dimension, and only in fantasy identi-
fication is there any slight measure of sup-
port for the developmental identification hy-
pothesis. Here, the association between father
distance and feminine identification in TAT
productions is too small, however, to warrant 
rejection of the hypothesis of no association 
(p < .12, two-tailed). 


DISCUSSION


Contrary to the expectations based upon 
current identification theories, this investiga- 
tion failed to find significant differences be- 
tween father-absent and father-present boys 
in any of the dimensions usually related to 
sex typing. The expectation that sex typing 
might be related to the power distribution 
within the family was also unconfirmed. The 
finding that father closeness is associated with 
overt homosexuality rather than its converse 
seems to contradict the developmental identi- 
fication hypothesis and deserves further analy- 
sis. Surprisingly, of the three subjects whose 
fathers were rated as “obviously very warm 
towards, involved with, and emotionally close 
to his son,” two had engaged in frequent 
homosexual acts and one of these confessed 
to being a homosexual prostitute. None of the 
four boys whose fathers were considered “ob- 
viously cold towards, uninvolved with, and 
emotionally distant from his son” reported
any homosexual experiences at all when ques- 
tioned under sodium amytal. 

A possible explanation of these data may 
emerge if the acquisition of heterosexuality 
were attributed to some process other than 
identification with the father. It is possible, 
for example, that learning by differential re- 
inforcement rather than by modeling may be 
more decisive in the acquisition of sex-appro-
priate behavior. Such a possibility seems most
apparent when heterosexuality is considered, 
since unlike the parents’ sex-typical social be- 
havior their specifically sexual behavior is 
usually unobserved by the child and there- 
fore not subject to imitation learning. If this 
hypothesis were true, homosexuality would 
be relatively independent of those father 
characteristics which presumably facilitate 
identification with the father. 

A theory of sex typing which does not re- 
quire identification as a mediating process has 
been outlined by Colley (1959). According to 
this theory, the acquisition of appropriate sex 
typing is contingent upon what the child 
learns of the expectations which the significant 
adults and others in his life have for him. 
Thus, a moderate amount of seductive be- 
havior towards the male child by the mother 
is considered to be important as a means of 
encouraging the development of heterosexual 
approach behavior. Similarly, a certain opti- 
mum of hostile and rivalrous behavior on the 
part of the father is important in discourag- 
ing homosexual approach behavior. This the- 
ory has the advantage over derivatives of 
Freudian identification theory in not requir- 
ing different determinants of the acquisition 
process in males and females. It also circum- 
vents the thorny problem faced by all identi- 
fication theories of explaining how it is pos- 
sible for a fatherless boy to develop appro- 
priate sex typing at all. Colley points out 
that: 


Even in a father’s absence, an appropriately identi- 
fied mother will respond to a boy “as if” he were 
a male and will expect him to treat her as a male 
would treat a female. When she and her son are to- 
gether in the presence of other males she will expect 
of him some competition, hostility, and lack of 
sexuality in his relations with the other males re-
gardless of their ages. Her interpretive approval or
disapproval of his play with other male children 
. . . also serve to let him know what she expects of 
male with male interactions [pp. 173-174]. 


Not only does the Colley differential ex- 
pectation theory permit an alternative ex- 
planation for the absence of differences in 
sex typing between the FA and FP boys, it 
also suggests a possible explanation of the 
significant association between father close-
ness and homosexual tendencies. It may have
been that those fathers rated as closest to
their sons were also the most seductive to- 
wards their sons. A high degree of implicit 
sexualization of the father-son relationship 
would, according to the differential expecta- 
tion theory, predispose the child towards 
sexual approach behavior with males. 

A review of the case histories of the sub- 
jects used in this study gives considerable 
credence to the possibility that judgments 
of seductiveness may have been confounded 
with, or included as a major component of, 
ratings of father closeness. One of the fathers 
rated closest to his son spoke with great de- 
light about how he enjoyed washing his son 
and ministering to him when the boy was 
small. On one visit to the institution, he is 
reported to have hugged and kissed his son, 
calling him “my darling.” Another of the fa- 
thers rated as closest spoke at length of how 
important it was to him that his son love him. 
In both instances the child became involved 
in frequent homosexual acts, although in 
neither case was the child considered to be 
particularly effeminate. 

An explanation in terms of direct reinforce- 
ment rather than modeling as the process by 
which homosexual tendencies may be acquired 
offered itself repeatedly as individual case 
histories were reviewed. The only subject who 
was willing to characterize himself as a homo- 
sexual and who anticipated a future life 
within a homosexual subculture did not have 
a father; yet he attributed his own homo- 
sexuality and transvestism to his mother’s 
dressing him in girl’s clothing during his early 
childhood and curtailing his efforts at male 
behavior. Almost identical cases of the moth- 
er’s direct reinforcement of sex-atypical be- 
havior have been reported by Litin, Giffin, 
and Johnson (1956). 

Of course, no conclusions may be drawn 
from the anecdotal reports upon which these 
speculations are based. A differential rein- 
forcement theory of sex typing stands as 
much in need of experimental confirmation 
as do identification theories. Nevertheless, the 
present data suggest extreme caution in treat- 
ing sex typing solely from the point of view 
of identification or modeling theory. 


JULES M. GREENSTEIN
THE PHYSIOLOGY OF CATHARSIS 1


MICHAEL KAHN 
Yale University 


36 male freshmen were angered and randomly assigned to a catharsis or non- 
catharsis condition. Physiological measures were recorded for a 20-min. recovery 
period. Catharsis Ss: (a) disliked their annoyer significantly more than con- 
trol Ss; (b) showed a slower rate of physiological recovery in skin temperature, 
PGR, skin conductance, and muscle tension (blood pressure showed a reverse 
finding); (c) showed more of a physiological recovery pattern than control Ss.
It is suggested that support and encouragement from an authority, and the 
need to reduce the cognitive dissonance created by getting a person in trouble, 
increased anger and raised autonomic levels, but that a value of catharsis may 
be the replacement of an autonomic arousal pattern with an autonomic recovery
pattern.


The experiment to be reported here is 
primarily concerned with the reduction of 
autonomic activity which has been increased 
by experimentally aroused anger. It was de- 
signed to explore some aspects of the hy- 
pothesis that the expression of anger permits 
a more rapid reduction of aroused physiolog- 
ical activity than does the absence of such 
expression. 

The notion of aggression catharsis is deeply 
embedded in our language and our way of 
looking at the world. It is also deeply em- 
bedded in modern psychology, stemming from 
psychoanalytic theory and the frustration- 
aggression hypothesis (Dollard, Doob, Miller, 
Mowrer, & Sears, 1939). A good deal of re- 
search has been devoted to this hypothesis in 
the last 20 years, and this literature has 
been thoroughly reviewed by Berkowitz (1958, 
1962) who has also carefully delineated the 
problems involved in studying aggression 
reduction; his work will not be duplicated 
here. It will suffice to note that almost no 
controlled research has been done on this 
problem using autonomic measures as de-
pendent variables,2 and since autonomic vari-
ables are a crucial part of the understanding


1 Based on a Harvard PhD thesis (Kahn, 1960). 
I am very grateful to George Mandler for his con- 
siderable assistance. This investigation was supported 
in part by grants from the Foundations Fund for 
Research in Psychiatry and the National Institute of 
Mental Health, United States Public Health Service 
(George Mandler, principal investigator). Financial 
assistance was also received from the Laboratory of 
Social Relations, Harvard University. 

2 A recent exception is Hokanson (1961).
of any aspect of aggression (Mandler, 1962),
it seemed a logical next step in the catharsis
research. In addition to the theoretical im-
portance of physiological measures relative to
catharsis research, there is a logical reason
for using them. Berkowitz points out that a
great difficulty in interpreting much of the
catharsis literature is that of separating
blocked response from reduced drive. It is
always difficult to tell whether an observed
reduction in aggressiveness represents an
actual reduction of instigation to aggression
or whether it merely represents the suppres-
sion of the aggressive response. This suppres-
sion effect ought to be least likely when the
measured response is one as tenuously related
to conscious control as is autonomic activity.
There is however a new problem attendant
upon using physiological measures. Just as
previous catharsis experiments have always
had a difficult time distinguishing reduced
drive from inhibited response, a physiological
study of catharsis faces just the opposite diffi-
culty: that of distinguishing continuing ag-
gressive drive from aggression-produced anx-
iety. Both should give a picture of physiolog-
ical arousal. Previous investigators, notably
Ax (1953), Schachter (1957), and Funken-
stein, King, and Drolette (1957) have made
important inroads into this problem by study-
ing the differences in polygraph response be-
tween frightened and angry subjects. How-
ever, the problem is not yet solved. It is true
that meaningful and significant differences
can be obtained using group averages, but
the individual differences are so great and so
complex, that it is not yet possible to look at
a subject's record and say with any degree of
confidence what emotion he was subjectively
experiencing.


Hypotheses 


There were three hypotheses generated by 
the catharsis notion. The first two are straight-
forward:

I. The experimental (express) group will 
dislike the noxious experimenter significantly 
less than will the control (withhold) group. 

II. The experimental group will show, in 
its mean autonomic reactivity, a steeper re- 
covery gradient than will the control group. 

The third hypothesis is more complex and 
requires some exposition: Lacey (1950; 
Lacey, Bateman, & Van Lehn, 1952, 1953) 
has demonstrated autonomic response speci- 
ficity, that is, that subjects tend to react to 
different stresses with approximately the same 
pattern of autonomic response. That is, if a 
subject responds to the cold-pressor test with 
a certain hierarchy of response, for example, 
skin conductance > heart rate > muscle ten- 
sion > systolic blood pressure, he is apt to 
show that same hierarchy, or one very like it, 
when doing arithmetic problems. Extrapolat- 
ing from Lacey, one might assume that since 
subjects have their own stress style which is 
common across stresses, then they may also
have a recovery style common across the
recovery from different stresses. The catharsis
hypothesis implies that the experimental con-
dition (expression) would permit the experi-
mental subjects to recover, but that the con-
trol condition (withholding) would keep the
control subjects aroused. It was therefore
predicted that in the anger-recovery period,
the experimental subjects would show au-
tonomic patterns similar to those apparent
while recovering from the cold-pressor. The
control group on the other hand would not
show recovery patterns, but on the contrary
would continue to show the arousal pattern,
even though the arousal phase was in fact
past. This reasoning led to Hypothesis III:

III. The experimental group will show
greater autonomic response specificity be-
tween cold-pressor recovery and anger re-
covery than will the control group. The con-
trol group on the other hand will show greater
specificity between anger arousal and anger
recovery than will the experimental group.


METHOD 
Procedure 


Thirty-six males, 18 or 19 years old, were used 
as subjects. All were Harvard freshmen recruited 
by a sign posted in the dining hall, and all were 
paid. A modification of the arousal situation de- 
vised by Ax (1953) and Schachter (1957) was used 
as the basis of the procedure. The experiment was 
described to the subjects as medical in nature: an 
elaborate deception precluded their associating the 
study with psychology. There were two experi-
menters: E1, who played the role of an ill-mannered
technician, and E2, who played the role of the
physician in charge of the experiment.3 E2 was
always warm and supportive with the subject. E1
was crude and vulgar, but never hostile or threaten- 
ing, pretesting having revealed that great care had 
to be taken not to frighten the subjects. When the 
subject arrived he was greeted by E2, who told him
that we were interested in obtaining measures of 
physiological activity over a period of time in which 
the subject was completely calm and unstimulated 
except for a cold-pressor test early in the procedure. 
After the polygraph electrodes were attached to 
the subject and he had rested quietly for a 20- 
minute adaptation period, he was given the cold- 
pressor test (Hines & Brown, 1932), that is, his 
foot was immersed in 4 degrees centigrade water for 
a period of 80 seconds. The purpose of this was to 
assess his autonomic recovery style following a 
simple physical stress. His foot was then dried and 
he was allowed to rest quietly for another 20 
minutes while he recovered from the cold-pressor 
arousal. After this recovery period, the experimenters 
contrived to leave him alone with E1 for the anger-
arousal phase of the experiment. Using the pro-
fanity which had characterized his speech through-
out the experiment, E1 began to insult the subject
while seeming to work on the recording machine. 
Throughout most of the arousal he maintained the 
friendly, rude demeanor he had established earlier 
so that he would not frighten the subject. But though 
it was never frightening, it was highly insulting. 
He accused the subject of moving around, though 
E2 had previously assured the subject that he was
an excellent subject. He wondered if someone as 
young and immature as the subject could sit still 
for the rest of the experiment. He made slighting, 
town-gown references to Harvard boys, and em- 
ployed vulgar terms of address in speaking to the 
subject. At one point he made extremely vulgar 
reference to members of the subject's family, using 
phrases which are culture-wide releasers for fighting 
behavior. In accord with instructions received 


3 I am very grateful to David Berlew for serving
as my coexperimenter.
earlier from E2, the subject did not speak. The
anger-arousal phase lasted about 5 minutes.4

Until this point neither experimenter knew 
whether the subject was to be an experimental 
(express) or a control (withhold) subject. At this 
point each subject was randomly assigned to one 
of these groups. The experimenters now contrived 
to change places so that E2, the friendly experi-
menter, was alone with the subject. If the subject
were a control subject, E2 merely checked the
physiological measures, told the subject he was 
doing fine, and asked him just to sit a while longer 
and they would be finished. Then the subject sat 
quietly for a final 20 minutes. If the subject were an 
experimental subject, E2 checked the physiological
measures and expressed surprise and concern about 
the evident upset. He then queried the subject about 
what had happened in the room and kept probing 
until the subject had revealed a substantial part 
of the arousal procedure and had expressed his 
feelings of anger or annoyance. Then he praised 
the subject for confiding in him, stressing how 
much the subject had helped him. In this and similar 
ways he attempted to reduce the subject's aggres-
sion guilt. He then offered the subject vicarious
counteraggression by telling him that he would have
E1's supervisor reprimand E1 the next day. Finally
he attempted to reduce realistic fear of retaliation
by assuring the subject that he would hold this
disclosure in absolute confidence, that E1 would never
know the source of his reprimand. After this he 
told the subject he was doing fine and asked him 
just to sit still a while longer and they would be 
finished. The express procedure took about 5 
minutes.5 Then the subject was given a 20-minute 
rest to recover from the arousal. During the entire 
experiment the subject’s blood pressure was taken 
every 2 minutes. E2 took all the readings except
during the anger-arousal period, when E1 took the
blood pressure. 

At the end of the final recovery period, E2 asked
the subject to fill out a routine form. The form,
purporting to be from the dean’s office, was anony-
mous and the subject was assured neither experi-
menter would ever see it. It was supposedly for the
purpose of allowing the dean's office to keep [...]
on the use of undergraduates in research projects.
Embedded in the form was the question, “How
did you like (E1) personally?” The question was
followed by a graphic rating scale. The subject was
then undeceived and interviewed. Considerable care
was taken to explain the need for the deception and
the purpose of the research.


4 In the postexperimental interview only 1 out of
the 36 subjects denied having been angry. Most of
the subjects reported having been furious and
many reported a strong desire to hit E1. None of
the subjects reported having been afraid of E1. There
is no doubt that the arousal technique was successful.

5 Subsequent interviewing revealed that the
procedure was less successful in alleviating aggres-
sion-guilt and fear of retaliation than it was in
arousing aggression. A few subjects reported that
in spite of their realistic knowledge that such a
thing was highly unlikely they were nevertheless
afraid that E1 would find out and seek them out to
hurt them. A good many subjects reported having
felt bad about getting E1 into trouble. It is clear that
when dealing with a population whose normal re-
sponses include a good deal of aggression-guilt and
aggression anxiety, it takes more than procedural
care to free them from these lifelong feelings.


Physiological Measures 


[...] sound. Readings were taken every 2 minutes when
possible and the closest reading was used for each
time period. When necessary, values were inter-
polated between two readings.
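
The rule just described for the blood pressure readings (use the closest reading, interpolate when necessary) can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which interpolation was used, so straight-line interpolation is assumed, and the function names, the `tolerance` parameter, and the sample readings are hypothetical.

```python
def interpolate_reading(t, t0, v0, t1, v1):
    """Straight-line interpolation at time t between readings (t0, v0) and (t1, v1)."""
    if not t0 <= t <= t1:
        raise ValueError("t must lie between the two readings")
    if t1 == t0:
        return float(v0)
    frac = (t - t0) / (t1 - t0)
    return v0 + frac * (v1 - v0)


def value_at(target, readings, tolerance=0.5):
    """Value assigned to the time period at minute `target`.

    readings: list of (minute, value) pairs sorted by minute.
    The closest reading is used if it lies within `tolerance` minutes
    (an assumed cutoff); otherwise the two bracketing readings are
    interpolated.
    """
    closest = min(readings, key=lambda r: abs(r[0] - target))
    if abs(closest[0] - target) <= tolerance:
        return float(closest[1])
    before = [r for r in readings if r[0] < target]
    after = [r for r in readings if r[0] > target]
    if not before or not after:
        raise ValueError("target is not bracketed by two readings")
    (t0, v0), (t1, v1) = before[-1], after[0]
    return interpolate_reading(target, t0, v0, t1, v1)


# Hypothetical systolic readings (minute, mm Hg) with the minute-4 reading missed.
readings = [(0, 120), (2, 124), (6, 132)]
print(value_at(4, readings))    # interpolated between minutes 2 and 6
print(value_at(2.2, readings))  # closest reading is used
```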

All other measures were continuously recorded on
a Grass six-channel polygraph. The basic equip-
ment consisted of a Grass pen-recorder, driven by
standard Grass dc amplifiers. Four channels were
used in the present study:

1. Heart rate (HR): Using a cardiotachometer
record, the heart rate was analyzed in [...]
in the period, an analysis which Lacey (1956) sug-
gests reveals bursts of sympathetic activity.
2. Psychogalvanic response (PGR) and skin con-
ductance (SC): This channel, which consists of a
record of the electrical resistance of the skin,
measures two separate processes which may have
altogether different concomitants. It measures slow,
relatively stable changes in basic resistance measured
in ohms. It is convenient to transform these to
units of log micromhos of conductance and so this
measure will be called skin conductance (SC). It
also measures fast changes where the resistance
suddenly drops sharply and then usually rises
after a moment. It is this sudden drop in resistance
which is traditionally called the PGR and will be so
called here. Preamplification for this conductance-
PGR measure was supplied by a Fels dermohm-
meter which incorporated an automatic range[...]
feature. The subject current is 70 microamps. [...]
electrodes were used on the left palm and left [...].
The SC measure represents the point of highest con-
ductance reached in a given period of time; the
PGR measure represents the number of PGRs
(converted to a uniform rate-per-3-minute measure)
in a given time period. A PGR is defined as a
drop in resistance of 400 ohms or more.
3. Temperature (FT): A Yellow Springs
thermometer with a thermistor probe was placed
against the ball of the middle finger. Finger tempera-
ture was chosen over face temperature, following
Plutchik (1956) who presents evidence that it is
more responsive to recovery than is face temperature.
FT was read directly at each desired time period.
4. Muscle tension (MT): A Grass electromyograph
preamplifier was employed. Electrodes were placed
over the subject’s left masseter. A special Grass in-
tegrator served as a cumulative recorder with an
automatic reset feature. The angle of the slope
produced by this integrator reveals the average out-
put of the muscle during any given time period. For
purposes of this study the data from the channel
were recorded in terms of the number of the in-
tegrator's resets in each time period, expressed in
terms of resets per 3-minute period.


RESULTS 
The Subjects' Dislike of Noxious Experimenter


Contrary to the catharsis hypothesis, the
experimental (express) group disliked E1 sig-
nificantly more than did the control group
(t = 2.62, p < .02, two-tailed). Thus Hy-
pothesis I is strongly disconfirmed.


Physiological Results 


For purposes of analysis, the experiment 
was divided into 16 time periods consisting of 
each stress peak (cold pressor and anger 
arousal) and 7 arbitrarily chosen points in 
each recovery phase. These points were the 
end of the second, fourth, seventh, tenth, 
thirteenth, sixteenth, and twentieth minutes 
of recovery.6


6 These points appear on the graphs in Figures 1
and 2 abbreviated as follows: base = end of 20-
minute adaptation period, CP = cold pressor, CPR2
= 2 minutes into the cold-pressor recovery period,
FIG. 1. The raw data.

The raw physiological data are shown in
Figure 1.7

Levels of physiological activity. Following 
Lacey (1956), the physiological data were 
transformed by regression analysis into 
autonomic lability scores in order to deal with 
the law of initial values by holding base line
constant. Lacey's solution to this problem was 
extended to cover the needs of a recovery 
study by performing his regression analysis a 
second time, this time using the autonomic 
lability scores as the raw data. This second 
transformation was for the purpose of holding 
stress level constant. The moment at the be- 
ginning of recovery is taken as the base line 
for each of the recovery periods. Thus, by 
definition, the two groups are equal at the 
beginning of the two recovery cycles (CP 
and BAR). We will call these doubly trans- 
formed scores autonomic recovery scores 
(ARS): it is assumed that they make the 
recovery gradients of all subjects directly 
comparable regardless of base line or of at- 
tained stress level. ARS scores are expressed 


AA = anger arousal, BAR = begin anger recovery,
AR2 = 2 minutes into the anger-recovery period.

7 Significant differences are shown by p values on 
the graphs at the appropriate time points. 


[Figure 2 graphs not reproduced; legible panel titles include PGR ARS and DBP ARS.]

FIG. 2. The ARS data. (See text for explanation.)


in standard form (M = 50, SD = 10). 
Figure 2 shows the ARS scores for the six 
channels. A word is in order concerning the 
construction of these graphs, each of which 
is drawn in two separate cycles, one for cold- 
pressor recovery and one for anger recovery. 
For any given time period ARS scores were 
computed for all 36 subjects. The mean of 
these scores is 50. Then the subjects were di-
vided into the two groups and the two group-
mean scores for the time period were com-
puted. These mean scores are equidistant from
the grand mean of 50, and thus all the ARS 
graphs of the two groups are mirrored lines 
around 50. Since the variance of the popula- 
tion is known, it was possible to draw lines 
representing p values which give an automatic 
significance test by merely observing whether 
or not one of the group lines reaches one of 
the p-value lines. The farther apart the lines 
at any given time period, the greater the 
difference between the two groups, the group 
showing the higher ARS score being the most 
aroused. 
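
The double transformation described above can be sketched in a few lines. The sketch assumes that the lability score is the standardized residual from regressing the later level on the reference level, rescaled to M = 50 and SD = 10, and that the second pass regresses the recovery-period lability scores on the stress-peak lability scores. The exact computations used in the study are not given in the text, so the function names and the sample data are hypothetical.

```python
import math
from statistics import fmean, pstdev


def zscores(xs):
    """Standardize a list to mean 0 and (population) SD 1."""
    m, s = fmean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]


def pearson(xs, ys):
    """Product-moment correlation computed from z scores."""
    return fmean([a * b for a, b in zip(zscores(xs), zscores(ys))])


def lability_scores(reference, level):
    """Residualize `level` on `reference`, then rescale to M = 50, SD = 10.

    The result correlates zero with `reference` by construction.
    Undefined when the two lists correlate perfectly (r = +1 or -1).
    """
    r = pearson(reference, level)
    scale = math.sqrt(1.0 - r * r)
    zr, zl = zscores(reference), zscores(level)
    return [50.0 + 10.0 * (y - r * x) / scale for x, y in zip(zr, zl)]


def recovery_scores(base, peak, recovery):
    """Second pass: hold attained stress level constant as well as base line."""
    als_peak = lability_scores(base, peak)
    als_recovery = lability_scores(base, recovery)
    return lability_scores(als_peak, als_recovery)


# Hypothetical readings for six subjects on one channel.
base = [70, 64, 80, 75, 68, 72]
peak = [95, 80, 110, 90, 99, 85]
recovery = [78, 70, 90, 84, 72, 80]

ars = recovery_scores(base, peak, recovery)
express, withhold = ars[:3], ars[3:]  # arbitrary split into two equal groups
print(round(fmean(ars), 6), round(pstdev(ars), 6))
print(round(fmean(express) + fmean(withhold), 6))  # equal-sized groups mirror around 50
```

Because the scores for all subjects average 50, the means of two equal-sized groups must lie equally far above and below 50, which is why the group lines on the ARS graphs mirror each other.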

It will be seen that in spite of some am-
biguity, Hypothesis II was significantly dis-
confirmed by four of the six channels, finger
temperature, PGR, skin conductance, and
muscle tension, in all of which expression led
to slower recovery, a finding contrary to the
catharsis hypothesis. The clearest instances of
this are found in the PGR and muscle-tension
channels. In the muscle-tension ARS graph
it will be noted that the groups were signifi-
cantly different at three of the anger recovery
periods though they had not differed sig-
nificantly during the cold-pressor recovery
(note that the line representing the express
group is above the .05 line in the anger-
recovery cycle, but is not in the cold-pressor
recovery cycle). The skin conductance finding
is made somewhat ambiguous by the groups
having also been significantly different during
the cold-pressor recovery period.

When base line and stress level are held
constant (ARS scores), systolic and diastolic
blood pressure supported Hypothesis II, that
is, expression led to more rapid recovery.
Note, for example, that the difference between
groups reaches the .05 level at ARas (SBP-
ARS).

The heart-rate data were inconclusive. 

The overall findings represented by Figures
1 and 2, then, are that the physiological
channels for the most part disconfirmed the
catharsis hypothesis with the exception of the
blood pressure channels which tended to
support it.

One general finding of interest emerging
from Figure 1 is the extent to which anger
activated the autonomic nervous system rela-
tive to the cold-pressor test. It is an impres-
sive demonstration of the degree to which
psychological manipulations can arouse a sub- 
ject who is sitting quietly and being neither 
hurt physically nor threatened with physical 
hurt. 


Patterns of Physiological Activity 


To test Hypothesis III the data were ap-
proached in two ways. First each subject was
assigned a cold-pressor recovery score and
an anger-recovery score for each channel.
These were arrived at by averaging for that
channel his ARS scores for the seven recovery
periods. Then for each group, anger-recovery
scores were correlated with the cold-pressor
recovery scores, so that each group has a
correlation (N = 18) for each channel. Hy-
pothesis III predicts that the experimental
group will have the higher mean correlation
(i.e., mean of the seven channels) since high
correlations indicate communality between
recovery from the two stresses. This prediction
was supported: the mean correlation (across
all channels) for the experimental group was
+.349 while the control group's mean cor-
relation was +.108 (t = 2.36, p < .025, one-
tailed). In order to eliminate the possibility
that this finding may have been due to char-
acteristics of the groups rather than of the
experimental treatments, analogous correla-
tions (N = 18) were computed for each
channel between the scores obtained on cold-
pressor arousal and anger arousal, that is, be-
fore the groups had been treated differently.
It was reasoned that, if this analysis showed
no significant difference, confidence would be
increased. The difference was in fact insignifi-
cant: the experimental group showed a mean
correlation of +.257 and the control group a
mean correlation of +.225 (t = .323). The
related hypothesis was that the control group
should show greater communality from anger
stress to anger recovery. This could not be
tested by the present analysis because of a
statistical artifact,8 but it can be tested in
the following fashion.


TABLE 1

MEAN SPEARMAN RANK-ORDER CORRELATIONS
REPRESENTING SPECIFICITY BETWEEN EACH
COLD-PRESSOR RECOVERY PERIOD AND
ITS ANALOGOUS ANGER-RECOVERY
PERIOD (ALS Scores)

                                Group
Between                 Experimental   Control       t

CPR2-AR2                    .210         .144       .485
CPR4-AR4                    .121         .042       .567
CPR7-AR7                    .184         .139       .308
CPR10-AR10                  .200        -.020      1.680*
CPR13-AR13                  .198        -.052      1.441*
CPR16-AR16                  .262        -.023      1.914
CPR20-AR20                  .028        -.159      1.170
M of each subject's         .172         .009      1.771**
  seven correlations

* p < .10, one-tailed test.
** p < .05, one-tailed test.
*** p < .025, one-tailed test.
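
The first analysis above (averaging each subject's ARS scores over the recovery periods, then correlating the two recovery scores across the subjects of a group, channel by channel) can be sketched as follows. The data layout, the function names, and the sample values are hypothetical, and the product-moment coefficient is an assumption, since the text does not name the coefficient used in this first analysis.

```python
from statistics import fmean, pstdev


def pearson(xs, ys):
    """Product-moment correlation between two equally long lists."""
    mx, my = fmean(xs), fmean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    return fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)]) / (sx * sy)


def mean_channel_correlation(cold, anger):
    """Mean over channels of the cold-pressor/anger recovery correlation.

    cold, anger: {channel: [one list of ARS values per subject,
                            holding that subject's recovery periods]}
    """
    correlations = []
    for channel in cold:
        # Each subject's recovery score is the mean of his ARS scores.
        cold_scores = [fmean(periods) for periods in cold[channel]]
        anger_scores = [fmean(periods) for periods in anger[channel]]
        correlations.append(pearson(cold_scores, anger_scores))
    return fmean(correlations)


# Hypothetical data: two channels, four subjects, two periods each.
cold = {"HR": [[48, 50], [52, 54], [45, 47], [55, 57]],
        "MT": [[50, 52], [49, 47], [53, 55], [46, 48]]}
anger = {"HR": [[49, 51], [53, 53], [44, 48], [56, 58]],
         "MT": [[47, 49], [52, 50], [50, 54], [49, 51]]}
print(round(mean_channel_correlation(cold, anger), 3))
```

Running the function once per group gives the two mean correlations that are then compared.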

The second way of testing Hypothesis III 
was a modification of the approach used by 
Lacey in his work on autonomic response 
specificity. For each channel and for each of 
the 16 time periods (two arousal peaks and 
seven time periods) a subject may be given a 
score for his physiological activity. Thus for a 
given time period a subject has seven scores, 
one for each of the seven channels. The scores 
of a given subject for any time period can be 
ranked and for any two time periods those 
ranks can be correlated. High rank-order cor- 
relations indicate high autonomic response 
specificity. To test the difference between 
groups for a given pair of time periods, the 
correlations of all subjects in the group are 
averaged and a t test is performed between
the two group means. All correlations were 
carried out twice, once using ALS scores and 
once using raw scores merely put in standard 
form (which, following Lacey, we will call 
ATS or autonomic tension scores) since Lacey 
has found that sometimes more light is thrown 
on the specificity question using one trans- 
formation and sometimes the other is more 
productive. 
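
The rank-correlation procedure can be sketched as follows: rank one subject's channel scores within each of two time periods, correlate the two sets of ranks, and compare the group means of these per-subject correlations with a t test. A pooled-variance t is assumed here because the text does not say which form was used; the function names and the sample scores are hypothetical.

```python
import math
from statistics import fmean, stdev


def ranks(values):
    """Ranks from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            out[order[k]] = shared
        i = j + 1
    return out


def spearman(a, b):
    """Rank-order correlation: product-moment correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = fmean(ra), fmean(rb)
    num = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    den = math.sqrt(sum((x - ma) ** 2 for x in ra) * sum((y - mb) ** 2 for y in rb))
    return num / den


def pooled_t(group1, group2):
    """Independent-groups t with pooled variance (an assumed choice)."""
    n1, n2 = len(group1), len(group2)
    pooled = ((n1 - 1) * stdev(group1) ** 2 + (n2 - 1) * stdev(group2) ** 2) / (n1 + n2 - 2)
    return (fmean(group1) - fmean(group2)) / math.sqrt(pooled * (1 / n1 + 1 / n2))


# Hypothetical standard scores for one subject on the seven channels
# (SBP, DBP, HR, SC, PGR, FT, MT) at two time periods.
arousal = [62, 55, 58, 66, 60, 41, 57]
recovery = [54, 52, 51, 60, 53, 47, 50]
print(round(spearman(arousal, recovery), 3))
```

In use, `spearman` would be applied to every subject for a given pair of time periods, and `pooled_t` to the two groups' lists of resulting correlations.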

The results of this rank-correlation analysis 
give added support to Hypothesis III, though 
not without major reservations. Table 1 shows 


8 The method of computing ARS scores demands
that they be correlated zero with the ALS arousal 
peak. 
TABLE 2

MEAN SPEARMAN RANK-ORDER CORRELATIONS REPRE-
SENTING SPECIFICITY BETWEEN ANGER-AROUSAL
AND EACH ANGER-RECOVERY PERIOD
(ATS Scores)

                                Group
Between                 Experimental   Control       t

AA-AR2                      .648         .819      2.431**
AA-AR4                      .546         .766      2.442***
AA-AR7                      .566         .800      2.862†
AA-AR10                     .623         .741      1.526*
AA-AR13                     .664         .702       .518
AA-AR16                     .544         .713      2.038**
AA-AR20                     .554         .724      2.4[...]
M of each subject's         .592         .752      2.494
  seven correlations

* p < .10, one-tailed test.
** p < .025, one-tailed test.
*** p < .01, one-tailed test.
† p < .005, one-tailed test.

the results pertaining to the prediction al- 
ready tested above, that is, that the experi- 
mental group will have higher communality 
between recovery from the two stresses. The 
prediction is again supported: all seven com- 
parisons are in the predicted direction and a 
t test between the means of all correlations 
is significant at the .05 level. This indicates 
that members of the experimental group 
tended to adopt their individual recovery pat- 
terns after anger arousal and catharsis more 
than did the members of the control group 
after mere anger arousal. 

Table 2 shows the results pertaining to 
the other part of the hypothesis, that is, the 
control (withhold) group will show higher 
anger-arousal/anger-recovery specificity. The 
prediction is supported: all seven comparisons 
are in the predicted direction and a t test
between the means of all the seven correlations 
is significant at the .01 level. This indicates 
that members of the control group tended to 
maintain their individual arousal patterns 
after anger arousal more than did the mem- 
bers of the experimental group after anger 
arousal and catharsis. 

In order to test the possibility of group 
rather than treatment differences, analogous 
comparisons were made for periods before the 
groups were treated differently. As a control 
for the results reported in Table 1, the cor- 
relations between cold-pressor arousal and 
anger arousal were compared. As a contr[...]
the results reported in Table 2, however, the
comparison between the two arousals revealed
that even before they were treated differently
the experimental group had a significantly
higher mean correlation, considerably weaken-
ing the findings reported in Table 1. A further
weakness of these findings is that Table 1
reports findings based on ALS scores. The
same analysis performed with ATS scores
failed to find significant differences. Similarly,
Table 2 reports findings based on ATS scores
whereas the same analysis performed with
ALS scores failed to find significant differences.

[...] between the ATS and ALS analyses probably
lies in the fact that the ALS scores are con-
structed using one base line for the cold-
pressor cycle and another base line for the
anger cycle, thus in effect, giving the two
cycles a common base line. Therefore,
specificity analyses which deal with [...]
of two cycles (in this case, that reported in
Table 1) are probably more meaningfully
conducted with ALS scores to stabilize the
base line. On the other hand, analyses dealing
with only one cycle (here that shown in
Table 2) are more logically done with ATS
scores.


DISCUSSION 


Before proceeding to discussion and inter-
pretation of the results it is well to note that
the experiment is a primitive early step in the
study of the physiology of catharsis and that
certain imperfectly controlled factors make
interpretation necessarily tentative. The ex-
pression condition, for instance, contained ele-
ments other than simple expression and
counteraggression, such as E2’s probing of the
subject for material that the subject was not
always instantly ready to divulge. Another
problem is that presented by the groups'
different time intervals between arousal and
recovery, as well as the amount of [...]
which filled that time; that is, the express
group talked and the withhold group did not.
Thus the following discussion and conclusions
must be viewed as tentative pending further
research in which such variables as these can
be better controlled.

Three main findings have emerged from this
experiment: the experimental group disliked
E1 significantly more than did the control
group; during the recovery period, the ex-
perimental group was generally more aroused
physiologically than was the control group,
except for the blood pressure channels on 
which they recovered significantly better 
(ARS scores); when patterns of physiological 
response during the recovery period were 
analyzed, the control group shows significantly 
more of a continuing arousal pattern and the 
experimental group shows significantly more of 
a recovery pattern. That is, the pattern an- 
alysis and the absolute-levels analysis yield 
conflicting findings. 

The experimental treatment can be sub- 
divided into three components: the experi- 
mental subjects expressed their feelings, they 
received support and encouragement, and they 
counteraggressed. Previous research indicates 
that when expression is not followed by any 
social feedback (Feshbach, 1955) or when the 
recipient is a peer (Pepitone & Reichling, 
1955) instigation to aggression is lowered. 
No previous study deals with the effects of the
subjects receiving support and encouragement
from an authority figure for their expressions
of feeling. It is reasonable to believe that this
support and encouragement led to an increase
in aggression. Schachter (1959) has shown
that at the level of verbal report, subjects are
susceptible to social influence regarding
quality and degree of emotion. The present
data indicate that it is not just the verbal
report of emotion which is affected, but that
the autonomic nervous system may be subject
to this same influence.

In considering the previous research on
counteraggression it is important to distinguish
between actual counteraggression and the
mere opportunity for counteraggression. In
almost all studies dealing with this variable
(Rosenbaum & deCharms, 1960; Thibaut &
Coules, 1952; Worchel, 1957) the subjects
have to a large extent failed to take advan-
tage of the offered chance to counteraggress.
In the latter two studies this opportunity
seems to have been aggression reducing.
In the former there is evidence (Pirojni-
koff, 1958) that it was aggression increasing.
A study in which the subjects actually did
counteraggress is Hokanson’s (1961) in which
frustrated subjects gave shocks to their
annoyer. Since Hokanson obtained no rating 
of the subject’s felt aggression between the 
time of frustration and the time of counter- 
aggression, it is not possible to tell what effect 
counteraggression had on this variable. How- 
ever, Hokanson measured systolic blood pres- 
sure and found that counteraggression pro- 
duced a reduction in this measure. Although 
that agrees with the finding of greater blood 
pressure recovery in the present experiment, 
it is puzzling to note that Hokanson took his 
measure 1 minute after the counteraggression, 
at which time our subjects were showing 
markedly elevated blood pressures (Figure 1). 
In any event, when reviewing the counter- 
aggression literature, we are faced with con- 
tradictory findings. One possible, though still 
imperfect, way of resolving them is to con- 
sider that the opportunity for counteraggres- 
sion can in some instances reduce the aggres- 
sion, whereas the actual commission of sig- 
nificant counteraggression (getting the an- 
noyer in real trouble) may serve to increase 
the aggression, as measured by the subject's 
dislike of the noxious experimenter as well as 
by significantly elevated level of autonomic 
activity. Festinger's (1957) theory of cog- 
nitive dissonance suggests an explanation for 
this phenomenon. Once the subject has
counteraggressed and gotten E1 into trouble,
the cognition of this action would seem dis-
sonant with the thought that E1 was not such
a bad guy and that the subject was not really
very angry at him. It is not very nice to get 
a person in trouble unless you are really mad 
at him. Therefore increasing the anger toward 
him would tend to reduce the felt dissonance. 
If this analysis is indeed tenable, it adds an 
important dimension to dissonance theory 
since it extends cognitive dissonance's sphere 
of influence to the autonomic nervous system. 
Dissonance theory also suggests an explana- 
tion of the findings that passed-up opportuni- 
ties for counteraggression tend to reduce 
residual aggression. If I am offered a chance 
to counteraggress and do not take it because 
of aggression anxiety, an excellent means of 
reducing the unwelcome feeling that I am a 
coward is to reduce the amount of anger I 
feel toward the antagonist. If I am really 
not so angry, I am not a coward for passing
up the offered opportunity to counteraggress.
Since the previously cited Schachter phenom- 
enon (concerning support and encouragement 
from an authority) also grows out of dis- 
sonance theory, it appears that this theory 
offers a parsimonious way of resolving several 
previously puzzling contradictions in the 
catharsis literature. Buss (1961) has at- 
tempted to deal with them by suggesting that 
catharsis is anger reducing when the subject 
is angry and anger increasing when he is not. 
But the findings of the present experiment 
clearly contradict that suggestion and raise the 
consideration that a dissonance explanation 
fits more of the data. 

One more result demands attention: the 
pattern analysis of autonomic response gives 
a finding diametrically opposed to that ob- 
tained from the levels of physiological re- 
sponse (except for the blood pressure chan- 
nels). There are of course a large number of 
possible post hoc explanations for this. Since 
no previous study has collected comparable 
data and since our own data do not permit 
us to choose among them, it is bootless to 
speculate at length. However, one possibility 
might be briefly pointed out. The function of 
catharsis might be to permit the autonomic 
nervous system to adopt a recovery pattern, 
even though it might do so at a high level of 
arousal. The absence of catharsis may delay 
the cessation of the arousal pattern, even 
though such phenomena as social influence 
and cognitive dissonance may reduce the 
level. When we recall that Worchel's (1957) 
catharsis subjects were better at a performance 
task, we may infer that suppressing an emo- 
tion interferes with performance and that sub- 
jects who allow themselves full experience and 
expression of an emotion both adopt a re- 
covery pattern and free themselves to deal 
with the environment.

EFFECTS OF PRIOR SUCCESS AND FAILURE ON
EXPECTATIONS OF SUCCESS AND
SUBSEQUENT PERFORMANCE ¹


N. T. FEATHER 
University of New England, N.S.W., Australia 


Ss worked at a task consisting of 15 anagrams. ½ of the Ss failed (initial
failure) and ½ succeeded (initial success) at the 1st 5 anagrams. ½ were told
the anagrams were easier than most (high expectation) and ½ that they were
more difficult than most (low expectation). All Ss rated their chances of success
before attempting each anagram. The last 10 anagrams were of 50% difficulty. 
Measures of n Achievement and Test Anxiety were available. Results showed 
that mean performance on the last 10 anagrams was significantly (p < .01) 
lower after initial failure than after initial success. Probability estimates re- 
flected the pattern of success and failure and shifted more after failure than 
after success. Success-oriented Ss made more "typical" changes in probability 
estimates in the success condition; failure-oriented Ss made more of these 
typical changes in the failure condition. Performance scores correlated posi- 
tively with initial probability estimates in the high expectation-initial success 
group. Results were discussed in terms of the theory of achievement motivation
and the transfer effects of prior experience.


Studies of the degree to which performance 
at a task is influenced by prior success or fail- 
ure at the task present a somewhat confusing 
pattern of results. In most of these investiga-
tions induced failure has the overall effect of
depressing performance (Katchmar, Ross, &
Andrews, 1958; Lazarus & Ericksen, 1952;
Osler, 1954; Sarason, 1956a), but this effect
is complicated by personality variables and
by situational factors such as the motivat-
ing character of the instructions and the de-
gree of failure which is induced. Thus, using
the Manifest Anxiety scale (Taylor, 1953),
Katchmar et al. found that high-anxiety sub-
jects were affected by failure to a greater de-
gree than low-anxiety subjects. Lucas (1952)
showed that the performance of subjects low
in manifest anxiety was facilitated by an in-
crease in failure, and Sarason (1956b) also
found evidence that failure may be facilita-
tive for subjects low in manifest anxiety.
Waterhouse and Child (1953), using an
Interfering Tendency Questionnaire (ITQ),
found that under failure conditions subjects


¹ This research was completed while the author
was Visiting Scientist and Research Associate at the
University of Michigan in the fall semester, 1963,
under the sponsorship of the Institute of Science and
Technology and National Science Foundation Project


with low ITQ scores were superior in per- 
formance to subjects with high ITQ scores, 
whereas under neutral or nonfailure condi- 
tions the opposite effect occurred. Williams 
(1955) showed that different failure pro- 
cedures had different effects on performance 
when considered in relation to the achieve- 
ment orientation of subjects. These studies 
suggest that the influence of prior success and 
failure on subsequent performance at a task 
is complex and depends upon both the char- 
acteristics of the person and the particular 
situation. 

The common procedure for inducing failure 
in the above studies has involved the use of 
some form of report by the experimenter after 
the subject has completed a certain number 
of trials, for example, a report to the subject 
that he has performed badly or has fallen be- 
low some fictitious group norm. The present 
experiment explored a different procedure, 
one in which the subject could see for him- 
self whether he had succeeded or failed at an 
item and did not have to be told. This situa- 
tion is analogous to that in which a person 
taking an important examination succeeds or 
fails on the first few items. We investigated 
the effect of this prior success or failure upon 
his subsequent performance. In addition to 
examining differences we were also interested
in trial-by-trial changes in expectations of
success in relation to the pattern of success 
and failure at the task. 

The present investigation was conceived 
within the context of the theory of achieve- 
ment motivation (Atkinson, 1957; Atkinson 
& Feather, 1966) which explicitly allows for 
the interaction of relatively stable personality 
dispositions (motives) and situationally de- 
fined variables (expectations and incentive 
values). Four main situations were investi- 
gated. In the high expectation-initial failure 
condition the instructions led subjects ini- 
tially to expect that the items they were about 
to perform would be easier than most, that is, 
an attempt was made to induce a high initial 
expectation of success. In fact these subjects 
failed on the first 5 items, which were in- 
soluble, and then proceeded with the 10 re- 
maining items which were so chosen as to be 
of 50% difficulty. The low expectation-initial
failure condition was exactly the same except
that initial instructions implied to subjects
that they should find the items more difficult
than most, that is, an attempt was made to
induce a low initial expectation of success.
In the high expectation-initial success condi-
tion instructions again implied that subjects
should find the items easier than most. In 
this condition, however, subjects succeeded 
on the first 5 items which were so chosen 
that all subjects should be able to answer 
them correctly. They then proceeded to the 
10 remaining items of 50% difficulty. The 
low expectation-initial success condition was 
exactly the same except that in this case sub- 
jects were informed that they should find the 
items more difficult than most. We were inter- 
ested in the effect of these four experimental 
conditions on subjects’ performance on the 
last 10 items of the task and the relation- 
ship of this performance to differences in n 
Achievement and Test Anxiety. 

In designing the experimental conditions
described above we assumed that the prior
failure in the high expectation-initial failure
condition would reduce expectations of success
towards a subjective probability of success
of .50 (P_s = .50), that the prior success
in the low expectation-initial success condition
would increase expectations of success
towards P_s = .50, that the prior failure in
the low expectation-initial failure condition
would determine very low values of P_s, and
that the prior success in the high expectation-
initial success condition would determine very
high values of P_s. If these assumptions are
valid, it follows from the theory of achievement
motivation that the changes in P_s determined
by the prior success or failure
should be accompanied by changes in the resultant
tendency to perform the task. The resultant
tendency to perform the task corresponds
to what we have called total motivation
to perform the task in earlier publications
(Feather, 1961, 1965a). It is assumed to be
the resultant of the motivation (or tendency)
to achieve success at the task, the motivation
(or tendency) to avoid failure at the task,
and extrinsic motivational tendencies (e.g., the
tendency to undertake an act that may lead
to social approval). The tendency to achieve
success is interpreted as a tendency to undertake
an act that may lead to success. The
tendency to avoid failure is interpreted as a
tendency to avoid undertaking an act that
may lead to failure, that is, it is inhibitory in
character (see Atkinson, 1964, pp. 28...;
Atkinson & Feather, 1966). Among subjects
in whom M_S > M_AF the resultant tendency
to perform the task is maximum when P_s =
.50. Among subjects in whom M_AF > M_S the
resultant tendency to perform the task is
minimum when P_s = .50. Thus, in contrast
to Atkinson's earlier position (Atkinson,
1957), the present assumptions do not imply
that a subject in whom M_AF > M_S should
"try hardest" when P_s = .50. On the contrary,
because of the strong inhibitory tendency
to avoid undertaking the task when P_s
= .50, the resultant tendency to perform the
task would be at a minimum for such a
subject.
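
The claim that the resultant tendency peaks or bottoms out at P_s = .50 can be checked numerically. The sketch below assumes the multiplicative formulation usually associated with Atkinson (1957), in which the incentive value of success is 1 - P_s and the incentive value of failure is -P_s; the text above does not spell out these formulas, extrinsic tendencies are left out, and the function name and motive strengths are illustrative only.

```python
def resultant_tendency(m_s: float, m_af: float, p_s: float) -> float:
    """Resultant achievement-related tendency (extrinsic tendencies ignored).

    Assumed formulation: tendency = motive x probability x incentive.
    """
    # Tendency to achieve success: incentive of success taken as 1 - P_s.
    approach = m_s * p_s * (1.0 - p_s)
    # Tendency to avoid failure: P_f = 1 - P_s, incentive of failure taken as -P_s.
    avoidance = m_af * (1.0 - p_s) * (-p_s)
    return approach + avoidance


# Hypothetical motive strengths, chosen only to illustrate the two cases.
grid = [i / 10 for i in range(1, 10)]
success_oriented = {p: resultant_tendency(2.0, 1.0, p) for p in grid}
failure_oriented = {p: resultant_tendency(1.0, 2.0, p) for p in grid}

p_at_maximum = max(success_oriented, key=success_oriented.get)
p_at_minimum = min(failure_oriented, key=failure_oriented.get)
print(p_at_maximum, p_at_minimum)
```

Under these assumptions the resultant reduces to (M_S - M_AF) x P_s x (1 - P_s), so its sign depends on which motive is stronger and its magnitude is greatest at P_s = .50 in both cases.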

In terms of these assumptions it follows
that, among subjects in whom the motive to
achieve success is stronger than the motive
to avoid failure (M_S > M_AF), the resultant
tendency to perform the task after the first
5 items should be stronger in the high expectation-
initial failure and low expectation-
initial success conditions, than in the low expectation-
initial failure and high expectation-
initial success conditions, since P_s is assumed
to be closer to .50 in the former two conditions.
Furthermore, this relatively strong resultant
tendency should tend to remain at a
high level, since the intermediate values of
P_s, which were assumed to develop following
performance at the first 5 items in the high
expectation-initial failure and low expecta-
tion-initial success conditions, should not un-
dergo marked variation because the 10 re-
maining items were selected so as to be of
approximately 50% difficulty. It follows that
subjects in whom M_S > M_AF should obtain
higher scores on the last 10 items in the
high expectation-initial failure and low ex-
pectation-initial success conditions, if we as-
sume that higher resultant tendency deter-
mines superior performance.

The above line of argument also implies
that, among subjects in whom the motive to
avoid failure is stronger than the motive to
achieve success (M_AF > M_S), the resultant
tendency to perform the task after the first
5 items should be stronger in the low ex-
pectation-initial failure and high expectation-
initial success conditions than in the high ex-
pectation-initial failure and low expectation-
initial success conditions, since P_s is assumed
to be further away from .50 in the former
two conditions. Hence, in contrast to the
prediction for subjects in whom M_S > M_AF,
we expected that subjects in whom M_AF >
M_S would tend to be relatively more success-
ful on the last 10 items of the task in the low
expectation-initial failure and high expecta-
tion-initial success conditions.

The two predictions stated above are valid 
only if changes in expectations of success fol- 
lowing upon performance of the first 5 items 
do occur as assumed in the four experimental 
conditions. An important function of the pres-
ent investigation was to study changes in
subjects' ratings of expectations of success
throughout the entire course of task per-
formance. The need for such a controlled
study of expectation change has been noted
before (Feather, 1963b) and the present in-
vestigation attempted to fulfill this need.
Among questions which we sought to answer
were the following: Do estimates of expecta-
tion of success change more rapidly after fail-
ure than after success? Is the rate of change
a function of the initial estimate of expecta-
tion of success, of n Achievement, and of
Test Anxiety? Are there differences between
subjects in the degree to which "typical"
changes in probability estimates occur after
success and failure?


METHOD 
Subjects 


Subjects were 96 female undergraduate students 
from an introductory course in psychology at the
University of Michigan in 1963. These subjects first 
wrote stories in response to six pictures which were 
presented in booklet form under neutral conditions 
using the standard procedure for obtaining stories 
(McClelland, Atkinson, Clark, & Lowell, 1953). 
These picture booklets had been previously employed 
in a national survey (Veroff, Atkinson, Feld, & Gurin, 
1960). In the same experimental session subjects also 
completed the Test Anxiety Questionnaire (TAQ; 
Mandler & Sarason, 1952).² The projective protocols
were coded for n Achievement by an experienced 
scorer.³ The TAQ was scored as in previous investi-
gations (e.g., Feather, 1963b). The mean n Achieve-
ment and TAQ scores for the 96 subjects were as
follows: mean n Achievement = 4.42, SD = 5.24;
TAQ = 104.83, SD = 24.61.

Both n Achievement and TAQ scores were trans-
formed to standard scores (i.e., to z scores). The
difference between the n Achievement z score and
the TAQ z score was calculated for each subject;
that is, z(n Achievement) - z(TAQ). In subjects for
whom this difference was positive it was assumed
that M_S > M_AF. In subjects for whom this differ-
ence was negative it was assumed that M_AF > M_S.
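
The classification rule just described can be sketched in a few lines. This is an illustration only: the scores are made-up values, the function names are hypothetical, and the use of the population standard deviation (rather than the sample estimate) is an assumption, since the text does not say which was used.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize raw scores to z scores (population SD assumed)."""
    m = mean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]


def classify_motive_orientation(n_ach_scores, taq_scores):
    """Label each subject by the sign of z(n Achievement) - z(TAQ)."""
    labels = []
    for z_ach, z_taq in zip(z_scores(n_ach_scores), z_scores(taq_scores)):
        difference = z_ach - z_taq
        if difference > 0:
            labels.append("success oriented")   # assumed M_S > M_AF
        elif difference < 0:
            labels.append("failure oriented")   # assumed M_AF > M_S
        else:
            labels.append("unclassified")       # zero difference: rule is silent
    return labels


# Hypothetical raw scores for four subjects.
n_ach = [8, 2, 5, 1]
taq = [90, 120, 100, 130]
labels = classify_motive_orientation(n_ach, taq)
print(labels)
```

Standardizing both measures first puts them on a common scale, so the sign of the difference reflects relative standing within the sample rather than the raw units of either test.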


Task 


Approximately 2 weeks after this initial experi-
mental session subjects were administered a test con-
sisting of 15 anagrams. Eighty-seven subjects par-
ticipated in this second session. Subjects were tested
in groups of about 30 and the anagrams were pre-
sented in booklet form with one anagram per page.⁴
Booklets corresponding to the four experimental
treatments were randomly distributed throughout
the group. In both the high expectation-initial fail-
ure and low expectation-initial failure conditions the
first 5 anagrams were insoluble. In both the high ex-
pectation-initial success and low expectation-initial
success conditions the first 5 anagrams were ex-
tremely easy and, in fact, nearly all subjects an-
swered them correctly. In all four experimental con-
ditions the last 10 anagrams were selected to be of
approximately 50% difficulty based on prior testing


² I am indebted to Marvin Brown and Stuart
Karabenick for administering these tests in the first
experimental session.

³ The n Achievement protocols were coded by
Sally Rentschler and the Test Anxiety Questionnaires
were scored by Phil Newman.

⁴ This test was administered by the author with
the assistance of Al Raphelson and Tom Triggs.


TABLE 1
ANAGRAMS USED IN THE EXPERIMENT

         First 5 anagrams              Final 10 anagrams
Initial failure | Initial success      (50% difficulty)

ALSEGT          | RFATHE               AFMERR   OCRSEU
EMAGLE          | MIDDEL               LEIRCC   ONERSP
FESLNI          | VERBLA               UESSNL   OMERND
UPSLON          | INNERD               ERROPP   ONEASS
OPUSGN          | SECNOD               AFILYM   GERIDB


under neutral conditions.⁵ These 10 anagrams were
randomly ordered for each subject according to a 
table of random permutations (Cochran & Cox, 
1957). Table 1 lists the anagrams used in the present 
study. 
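
The per-subject ordering of the final 10 anagrams can be illustrated with a short sketch. It substitutes a seeded pseudo-random generator for the published table of random permutations, keeps the first 5 anagrams in the order listed in Table 1 (an assumption, since the text only says the final 10 were randomly ordered), and uses a hypothetical function name.

```python
import random

# Anagrams as listed in Table 1 (initial success booklet shown here).
FIRST_FIVE_SUCCESS = ["RFATHE", "MIDDEL", "VERBLA", "INNERD", "SECNOD"]
FINAL_TEN = [
    "AFMERR", "OCRSEU", "LEIRCC", "ONERSP", "UESSNL",
    "OMERND", "ERROPP", "ONEASS", "AFILYM", "GERIDB",
]


def booklet_order(first_five, final_ten, rng):
    """Return one subject's 15-page booklet: fixed first 5, shuffled final 10."""
    shuffled = list(final_ten)   # copy so the master list is left untouched
    rng.shuffle(shuffled)        # stand-in for a table of random permutations
    return list(first_five) + shuffled


# A separate generator per subject gives each booklet its own permutation.
booklet = booklet_order(FIRST_FIVE_SUCCESS, FINAL_TEN, random.Random(1))
print(booklet)
```

Randomizing the order separately for each subject spreads any item-specific difficulty evenly over Trials 6 to 15, so trial position is not confounded with particular anagrams.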

All subjects received the following initial instruc- 
tions: 


The test that you are about to perform is a test 
of your verbal intelligence. Please try to do your 
best as your scores will be taken as a fair and ac- 
curate indication of your intelligence level. The 
test consists of a set of disarranged words (ana- 
grams). Your task is to rearrange each group of 
letters so that they make a meaningful English 
word. You will have 30 seconds to work at each 
anagram. Start when you are so instructed. Stop
at the stop signal. Do not turn over a page until 
you are told to do so. 


Subjects in the high expectation conditions were
then told:


You should find these anagrams easier than most.
About 70% of College students are able to solve
them correctly in the time allowed.


Subjects in the low expectation conditions were told:


You should find these anagrams more difficult
than most. About 30% of College students are
able to solve them correctly in the time allowed.


Before attempting each anagram subjects were re-
quired to check on a scale what they thought their
own chances were of solving the anagram.⁶ They
were requested to be as accurate as possible in mak-
ing this judgment. These ratings were made on a
5-inch scale numbered from 0 to 100 in equal steps
of 20, with the statement "No chance at all" at one
extreme of the scale, the statement "An even chance"
at the middle of the scale, and the statement "Com-
pletely certain" at the other extreme of the scale.
This procedure therefore provided a complete record
of probability estimates throughout task perform-
ance.

Fifteen subjects were randomly deleted from the
sample so as to obtain 9 subjects with positive z
score differences (M_S > M_AF) and 9 subjects with
negative z score differences (M_AF > M_S) in each of
the four experimental conditions. We will term the
former subjects success oriented and the latter sub-
jects failure oriented.


⁵ These anagrams were taken from a set of ana-
grams whose difficulty level, in terms of percentage
of subjects passing each anagram, had been ascer-
tained in a prior investigation by Marvin Brown. All
of these 10 anagrams were between 40% and 60%
in difficulty with an average difficulty of about 50%.

⁶ This rating was made by subjects without their
having seen the anagram.


RESULTS

Analysis of Probability Ratings

Mean changes over trials. Table 2 presents
means of the probability estimates obtained
from subjects prior to attempting each of the
15 anagrams. These means are tabulated
separately for success-oriented subjects and
for failure-oriented subjects in the four ex-
perimental conditions. Table 3 presents the
results of an analysis of variance applied to
these data (Collier, 1958).


TABLE 2
MEAN ESTIMATES OF PROBABILITY OF SUCCESS FOR ANAGRAMS OVER 15 TRIALS

                                                              Trials
Group                          1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  | 10  | 11  | 12  | 13  | 14  | 15
High expectation-initial failure
  Success oriented (N = 9)    .61 | .49 | .41 | .31 | --  | .16 | .27 | .31 | .28 | .31 | .32 | .24 | --  | --  | --
  Failure oriented (N = 9)    .70 | .53 | .38 | .30 | .22 | .21 | .29 | .36 | .51 | .45 | .52 | .42 | --  | --  | --
Low expectation-initial failure
  Success oriented (N = 9)    .56 | .49 | .38 | .35 | .29 | .23 | .33 | .33 | .35 | .33 | .31 | .35 | --  | --  | --
  Failure oriented (N = 9)    .55 | .36 | .28 | .19 | --  | .15 | .18 | .19 | .26 | .19 | .23 | .23 | --  | --  | --
High expectation-initial success
  Success oriented (N = 9)    .67 | .66 | .69 | .74 | --  | .79 | .68 | .72 | .69 | .70 | .68 | .67 | --  | --  | --
  Failure oriented (N = 9)    .54 | .58 | .66 | .70 | --  | .68 | .66 | .60 | .53 | .53 | .56 | .56 | --  | --  | --
Low expectation-initial success
  Success oriented (N = 9)    .50 | .63 | .66 | .74 | .74 | .78 | .70 | .60 | .59 | .61 | --  | --  | --  | --  | --
  Failure oriented (N = 9)    .57 | .62 | .70 | .76 | .78 | .80 | .69 | .75 | .74 | .65 | .70 | .69 | --  | --  | --


TABLE 3
ANALYSIS OF VARIANCE OF PROBABILITY ESTIMATES

Source                          df          MS           F

Between subjects                71
  Initial experience (A)         1     267,876.001     40.52*
  Initial expectation (B)        1       1,514.668      <1
  Motive orientation (C)         1          33.076      <1
  A × B                          1       4,396.334      <1
  A × C                          1          80.578      <1
  B × C                          1         191.688      <1
  A × B × C                      1      21,789.077      3.30
  Error (between)               64       6,610.604

Within subjects              1,008
  Trials (D)                    14         810.341      3.86*
  A × D                         14       4,171.689     19.86*
  B × D                         14         207.840      <1
  C × D                         14         204.315      <1
  A × B × D                     14          96.844      <1
  A × C × D                     14         129.442      <1
  B × C × D                     14          75.513      <1
  A × B × C × D                 14         220.815      1.05
  Error (within)               896         210.025

* p < .001.


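
Each F ratio in Table 3 is an effect mean square divided by the error mean square of its own stratum (between subjects or within subjects). The sketch below recomputes three of the reported ratios from the tabled mean squares; the function and variable names are illustrative, not part of the original analysis.

```python
# Error terms from Table 3, one per stratum of the mixed design.
MS_ERROR_BETWEEN = 6610.604   # df = 64
MS_ERROR_WITHIN = 210.025     # df = 896


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F is the effect mean square over the appropriate error mean square."""
    return ms_effect / ms_error


# Between-subjects effect is tested against the between-subjects error.
f_initial_experience = f_ratio(267876.001, MS_ERROR_BETWEEN)
# Effects involving trials are tested against the within-subjects error.
f_trials = f_ratio(810.341, MS_ERROR_WITHIN)
f_experience_by_trials = f_ratio(4171.689, MS_ERROR_WITHIN)

print(round(f_initial_experience, 2),
      round(f_trials, 2),
      round(f_experience_by_trials, 2))
```

The recomputed values agree with the tabled F ratios of 40.52, 3.86, and 19.86 to two decimal places.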

Table 3 shows that the main effect of initial
experience is highly significant (F = 40.52,
df = 1/64, p < .001). The overall mean
probability estimate for subjects who experi-
enced initial failure (M = .34) is signifi-
cantly lower than the overall mean prob-
ability estimate for subjects who experienced
initial success at the task (M = .65). The
main effect of trials is also highly significant
(F = 3.86, df = 14/896, p < .001). Prob-
ability estimates averaged over all conditions
show a tendency to decrease initially from a
mean value of .59 and then, after five trials,
to level out at a value of about .48. The
highly significant interaction of initial experi-
ence and trials (F = 19.86, df = 14/896, p
< .001) reflects the tendency for probability
estimates first to decrease among subjects who
experienced initial failure and then to rise for
the remaining 10 items of 50% difficulty. In
contrast, these probability estimates first tend
to increase among subjects who experienced
initial success and then to decrease for the
remaining 10 items. These changes in mean
probability estimates in relation to initial ex-
perience are presented in Figure 1. All of
these results may be taken to indicate the
dominant influence that success and failure
at the task have in modifying expectations of
success.

Probability estimates prior to Trial 1. Prob- 
ability estimates obtained prior to perform- 
ance at the first anagram tend to be higher 
in the high expectation conditions (M = .63) 
than in the low expectation conditions (M = 
.54). The difference between these means is 
statistically significant (F = 4.75, df = 1/64, 
p < .05). The reported norms (70% and
30%) used to induce high versus low initial
expectations of success thus appear to have
produced differences in the assumed direction.
The mean for the low expectation conditions
suggests, however, that initial expectations of
success may have been closer to a P_s of .50
in these conditions rather than very low.
Again there is a tendency for subjects to pro- 
vide initial estimates higher than the low re- 
ported norm (30%) and lower than the high 
reported norm (70%; see Feather, 1963c). 

Absolute changes in probability estimates.
The data in Table 2 and the curves in Fig-
ure 1 indicate that probability estimates tend
to change more after failure than after suc-
cess. The mean probability estimates change
from .61 (prior to Trial 1) to .19 (prior to
Trial 6) when the two conditions involving
initial failure are combined, and from .57
(prior to Trial 1) to .76 (prior to Trial 6)
when the two conditions involving initial suc-
cess are combined. When absolute differences
between probability estimates for Trial 1 and
Trial 6 are considered, the mean absolute
difference is significantly greater for the
combined initial failure conditions than for
the combined initial success conditions (F =
19.30, df = 1/64, p < .001). These data,
therefore, suggest that expectations of suc-
cess changed more after failure than after
success, at least insofar as magnitude of shift
is concerned.*


[Figure 1: mean probability estimate (ordinate) plotted against trials
(abscissa), with separate curves for initial failure (Trials 1-5) and
initial success (Trials 1-5).]

FIG. 1. Mean changes in probability estimates for
Trials 1-15 for combined initial failure groups and
combined initial success groups.


TABLE 4
MEAN NUMBER OF TYPICAL CHANGES IN PROBABILITY
ESTIMATES FOLLOWING SUCCESS OR FAILURE
ON FIRST 5 TRIALS

Initial experience          High expectation    Low expectation

Initial failure
  Success oriented                2.56                2.44
  Failure oriented                3.11                3.44
Initial success
  Success oriented                3.44                3.55
  Failure oriented                3.00                3.00

Analysis of variance

Source                      df       MS        F

Initial experience (A)       1     2.347     1.42
Initial expectation (B)      1      .125      <1
Motive orientation (C)       1      .347      <1
A × B                        1      .014      <1
A × C                        1     7.348     4.44*
B × C                        1      .125      <1
A × B × C                    1      .347      <1
Error                       64     1.656

* p < .05.


“Typical” changes in probability estimates. 
Changes in probability estimates in the dif- 
ferent experimental conditions may also be 
explored with respect to "typical" changes 
following success and failure. Probability 
estimates may be raised, maintained, or low- 
ered after success or failure at an anagram. 
Following usage in the level of aspiration lit- 
erature (Lewin, Dembo, Festinger, & Sears, 
1944), we define a typical change in a prob- 
ability estimate as one in which the estimate 

is raised after success or lowered after fail-
ure. The maximum possible number of typical
changes for the first 5 trials of the anagrams
test is 5. A score of 5 in the initial failure
condition would indicate that the subject re-
duced her probability estimate on each of the
5 occasions when she failed. Similarly, a score
of 5 in the initial success condition would in-
dicate that the subject increased her prob-
ability estimate on each of the 5 occasions
when she succeeded. Table 4 presents the
mean number of typical changes in prob-
ability estimates for the first 5 trials of the
task for success-oriented and failure-oriented
subjects in the four experimental conditions.
In the lower portion of Table 4 are the re-
sults of an analysis of variance applied to
these data. Table 5 presents the correspond-
ing data and analysis for typical changes in
probability estimates occurring over the last
10 trials of the anagrams test. In this case
the maximum possible number of typical
changes for a subject is 9 since subjects were
not required to report a probability estimate
after the last item (Trial 15).

⁷ The amount of absolute shift is lowest in the
high expectation-initial success condition and highest
in the high expectation-initial failure condition. The
interactive effect of initial experience and initial ex-
pectation on absolute shift is significant (F = 5.52,
df = 1/64, p < .05). This result may be due to a
"ceiling effect" since the rating scale is limited to
the range from 0 to 100.

⁸ In a study of the effect of success and failure
upon generalization of expectancy Marvin Brown
has also found that the effect of failure is greater
than the effect of success in terms of absolute amount
of shift (personal communication). Heath (1961,
1962) also reports similar results in studies of ex-
pectancy generalization.
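
The scoring of typical changes defined above can be sketched as follows. The function name and the example ratings are hypothetical; the only rule taken from the text is that a change counts as typical when the estimate is raised after a success or lowered after a failure.

```python
def count_typical_changes(estimates, outcomes):
    """Count typical changes in a run of probability estimates.

    estimates: ratings made before each trial, plus the rating made
               before the trial that follows the run (one more than outcomes).
    outcomes:  True for a solved anagram, False for a failure.
    """
    if len(estimates) != len(outcomes) + 1:
        raise ValueError("need exactly one more estimate than outcomes")
    typical = 0
    for before, after, solved in zip(estimates, estimates[1:], outcomes):
        if solved and after > before:
            typical += 1      # estimate raised after success
        elif not solved and after < before:
            typical += 1      # estimate lowered after failure
    return typical


# Hypothetical ratings before Trials 1-6 for a subject who failed Trials 1-5.
failure_run = count_typical_changes([60, 40, 40, 20, 20, 20], [False] * 5)
# Hypothetical ratings before Trials 1-6 for a subject who solved Trials 1-5.
success_run = count_typical_changes([40, 60, 80, 80, 100, 100], [True] * 5)
print(failure_run, success_run)
```

For the last 10 trials the same function would take the 10 ratings made before Trials 6 to 15 and the 9 outcomes of Trials 6 to 14, which is why the maximum score there is 9.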

Table 4 shows that the interactive effect of
initial experience and motive orientation on
typical changes is significant (F = 4.44, df
= 1/64, p < .05). Under conditions of uni-
form failure on the first 5 trials of the ana-
grams test, success-oriented subjects show a
lower mean number of typical changes in prob-
ability estimates (M = 2.50) than do failure-
oriented subjects (M = 3.28). In contrast,
under conditions of uniform success on the
first 5 trials, success-oriented subjects show
a greater mean number of typical changes
(M = 3.50) than do failure-oriented subjects
(M = 3.00). This trend is also apparent in
Table 5 for typical changes over the last 10
trials of the anagrams test but the A × C in-
teraction is not statistically significant. It ap-
pears then that success-oriented subjects are
more likely to make typical changes in prob-
ability estimates under success conditions
than under failure conditions, and that fail-
ure-oriented subjects are more likely to do
the reverse.


TABLE 5
MEAN NUMBER OF TYPICAL CHANGES IN PROBABILITY
ESTIMATES FOLLOWING SUCCESS AND FAILURE
ON LAST 10 TRIALS

Initial experience          High expectation    Low expectation

Initial failure
  Success oriented                3.67                3.33
  Failure oriented                4.89                4.78
Initial success
  Success oriented                4.89                5.78
  Failure oriented                5.44                3.67

Analysis of variance

Source                      df       MS        F

Initial experience (A)       1    10.889     1.91
Initial expectation (B)      1     2.000      <1
Motive orientation (C)       1     1.389      <1
A × B                        1      .222      <1
A × C                        1    20.056     3.52
B × C                        1     7.122     1.25
A × B × C                    1     8.389     1.47
Error                       64     5.705


Probability estimates and motive orienta- 
tion. Mean probability estimates for Trial 1 
do not differ significantly between success- 
oriented and failure-oriented subjects, nor do 
these subjects differ with respect to absolute 
changes in probability estimates from Trial 1 
to Trial 6. In the present study subjects had 
fairly precise information about the task be- 
fore they made their first probability estimate 
and thereafter their probability estimates 
were closely determined by their task per- 
formance. Under these conditions relation- 
ships of probability estimates to measures of 
n Achievement and Test Anxiety would be 
attenuated (Feather, 1965b). 

Probability estimates prior to Trial 6. Fi- 
nally, it is apparent that mean probability 
estimates prior to Trial 6 are very low in the 
low expectation-initial failure condition and 
very high in the high expectation-initial suc- 
cess condition. These data imply that subjects 
in the former condition had very low expecta- 
tions of success just prior to beginning the last 
10 items and that subjects in the latter con- 
dition had very high expectations of success 
just prior to attempting these items. This
agrees with assumption. However, it is also
apparent from Table 2 that mean probability
estimates prior to Trial 6 are not close to .50 
as assumed for the high expectation-initial 
failure and low expectation-initial success 
conditions. In fact these estimates are very
low for the former condition and very high

TABLE 6

MEAN NUMBER OF ANAGRAMS CORRECTLY ANSWERED
OUT OF FINAL 10 ANAGRAMS

Initial experience          High expectation   Low expectation

Initial failure
  Success oriented                3.89               4.22
  Failure oriented                5.00               5.67
Initial success
  Success oriented                6.00               6.11
  Failure oriented                5.78               6.44

Analysis of variance

Source                     df      MS        F
Initial experience (A)      1    34.722    8.38*
Initial expectation (B)     1     3.556     <1
Motive orientation (C)      1     8.000    1.93
A × B                       1      .056     <1
A × C                       1     6.722    1.62
B × C                       1      .889     <1
A × B × C                   1      .056     <1
Error                      64     4.142

* p < .01


for the latter condition. This result implies 
that subjects in the high expectation-initial 
failure condition had very low expectations of 
success just prior to beginning the remaining 
10 items, and that subjects in the low expec- 
tation-initial success conditions had very high 
expectations of success just before attempt- 
ing these items. These data therefore suggest 
that an important assumption involved in the 
present study is untenable, namely, that sub- 
jects in these last two experimental conditions 
would begin the remaining 10 items of the 
test with intermediate expectations of success. 

The violation of this assumption makes in- 
terpretation of the performance data rather 
difficult in terms of the present theoretical 
approach. However, we will now turn to an 
examination of these data. 


Analysis of Performance Data 


Mean number of anagrams solved. Table 6 
presents the mean number of anagrams cor- 
rectly answered out of the final 10 anagrams 
by success-oriented and failure-oriented sub- 
jects in the four experimental conditions. In 
the lower portion of Table 6 are the results 
of an analysis of variance applied to these 
data. 

Table 6 shows that subjects who failed at
the first 5 anagrams were less successful at
the remaining 10 anagrams than were subjects
who succeeded at the first 5 anagrams.
Mean number of anagrams correctly answered
is 4.69 after initial failure and 6.08 after
initial success. The main effect of initial
experience is significant (F = 8.38, df = 1/64,
p < .01). There are no other significant effects
in Table 6.⁹ Table 7 shows the number
of subjects who passed Items 6-15. It is apparent
that fewer subjects passed these items
after initial failure than after initial success
and that this difference is maintained over all
of the final 10 items. There are no differences
between success-oriented and failure-oriented 
subjects when an analysis of correct responses 
on Trial 6 is made in relation to probability 
estimates just prior to Trial 6 for each of the 
four experimental conditions. 
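
The arithmetic behind these figures can be retraced from Table 6 itself. The following is a minimal sketch, assuming equal cells of nine subjects (the cell size listed in Table 7) and using only the rounded cell means and mean squares printed in the table. The variable and function names are illustrative and are not part of the original analysis.

```python
# Cell means from Table 6, keyed by (initial experience, motive orientation).
# Each tuple holds (high expectation, low expectation).
cell_means = {
    ("failure", "success_oriented"): (3.89, 4.22),
    ("failure", "failure_oriented"): (5.00, 5.67),
    ("success", "success_oriented"): (6.00, 6.11),
    ("success", "failure_oriented"): (5.78, 6.44),
}


def marginal_mean(experience):
    """Average the four cell means for one level of initial experience.

    The cells are assumed equal in size, so the unweighted average of
    the cell means equals the mean of that whole group.
    """
    values = []
    for (exp, _orientation), pair in cell_means.items():
        if exp == experience:
            values.extend(pair)
    return sum(values) / len(values)


mean_after_failure = marginal_mean("failure")
mean_after_success = marginal_mean("success")

# Mean squares from the lower part of Table 6 (each effect has df = 1).
ms_error = 4.142
mean_squares = {"A": 34.722, "C": 8.000, "A x C": 6.722}

# Each F ratio is the effect mean square divided by the error mean square.
f_ratios = {effect: ms / ms_error for effect, ms in mean_squares.items()}

# With 36 subjects at each level of A, the sum of squares for A (df = 1) is
# (36 * 36 / 72) times the squared difference between the marginal means.
ms_a_from_means = (36 * 36 / 72) * (mean_after_success - mean_after_failure) ** 2

print(mean_after_failure, mean_after_success)
print(f_ratios)
print(ms_a_from_means)
```

Because the cell means are rounded to two decimals, the recomputed values agree with the printed ones only to within rounding error.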

Performance in relation to probability esti- 
mates. Table 8 presents the correlations be-
tween performance scores on the last 10 
anagrams and probability estimates obtained 
from subjects prior to Trial 1, Trial 6, and 
Trial 15. The correlations between perform- 


9 Success-oriented subjects do not perform better 
in the low expectation-initial failure condition than 
in the high expectation-initial success condition, nor 
do failure-oriented subjects perform significantly 
better in the high expectation-initial success condi- 
tion than in the low expectation-initial failure con- 
dition. The results of the present study therefore do 
not support Weiner's (1965) predictions based on 
the assumption of persisting inertial tendencies. Nor 
does the pattern of results change when performance 
scores of more extreme groups of success-oriented
and failure-oriented subjects are examined.

ance and probability estimates obtained prior
to Trial 1 and Trial 6 are significant only in
the high expectation-initial success condition.
The high, significant positive correlations be-
tween performance and probability estimates 
obtained prior to Trial 15 for the high ex- 
pectation-initial failure, high expectation-ini- 
tial success, and low expectation-initial suc- 
cess conditions reflect the dominant role of 
success and failure in influencing subjects’ re- 
ported probabilities. These results are consist- 
ent with previous findings (Feather, 1965b). 
The corresponding correlation is not, how- 
ever, significant in the low expectation-initial 
failure condition. 




DISCUSSION 


The main contribution of the present study
is its detailed investigation of changes in
probability estimates during the entire course
of task performance. Such an analysis has
not been made previously although it is obviously
useful to have information of this
type to check whether assumptions about expectations
of success (P_s) involved in predictions
about performance appear to have been
met when the theory of achievement motivation
is employed. The results again indicate
the dominant influence of success and failure
at the task in shaping judgments of probability
and, by implication, expectations of
success (cf. Feather, 1963c, 1965b). They
also show that probability estimates change
more after failure at the first 5 items than
after success at the first 5 items, and there-


TABLE 7

NUMBER OF SUBJECTS SOLVING ANAGRAMS ON TRIALS 6-15

Group                                   Trials

High expectation-initial failure
  Success oriented (N = 9)
  Failure oriented (N = 9)

Low expectation-initial failure
  Success oriented (N = 9)
  Failure oriented (N = 9)

High expectation-initial success
  Success oriented (N = 9)
  Failure oriented (N = 9)

Low expectation-initial success
  Success oriented (N = 9)
  Failure oriented (N = 9)


TABLE 8

CORRELATION OF PERFORMANCE SCORES ON LAST 10
ANAGRAMS WITH PROBABILITY ESTIMATES
PRIOR TO TRIALS 1, 6, AND 15

                                                 Probability estimate
Group                                         Trial 1   Trial 6   Trial 15
                                                 r         r         r

High expectation-initial failure (N = 18)      −.01      .05       .70
Low expectation-initial failure (N = 18)       −.05      .22       .23
High expectation-initial success (N = 18)      Ad*       .65**     .82
Low expectation-initial success (N = 18)       −.9       .27       Nd

* p < .10, two-tailed.
** p < .01, two-tailed.


fore suggest that total change in expectation 
tends to be greater following uniform failure 
than following uniform success when the 
number of failures and successes is the same 
and the initial expectation is intermediate 
(see Figure 1). The results also indicate that 
success-oriented subjects tend to make more
"typical" changes in their probability estimates
under conditions of success than under
conditions of failure, while typical changes
are relatively more frequent for failure-oriented
subjects under conditions of failure. It
seems, then, that the expectations of success-oriented
subjects are more subject to typical
changes following success experiences than
following failure experiences, and that the
reverse is true of the expectations of failure-oriented
subjects. This difference in responsiveness
to success and failure may be a function
of differences in past experience. If success-oriented
subjects have in the past been
involved more frequently than failure-oriented
subjects in test situations in which
they have succeeded, success would be a more
familiar experience to them and typical
changes in their expectations may therefore
be more likely following success. Similarly, 
failure-oriented subjects may have had more 
experience with test situations involving fail- 
ure than success-oriented subjects. Hence, 
failure would be a more familiar experience 
to them, and typical changes in their expec- 
tations may be more likely following failure. 

The assumption involved in this argument is
that typical changes in expectations of success
are more likely to follow familiar experiences
than unfamiliar experiences.

The changes in probability estimates re- 
ported in the present study may be compared 
with the results of investigations by Heath 
(1959, 1961, 1962) and Diggory and Mor- 
lock (1964). Heath was interested in the ef- 
fects of success or failure at one task on ex- 
pectations of success for other tasks increas- 
ingly dissimilar from the original task. His 
investigations were concerned with expect- 
ancy generalization within the context of 
reinforcement learning principles (Rotter, 
1954), and are continuous with earlier studies 
of transfer effects of prior experience on levels 
of aspiration (cf. Jucknat’s study in Lewin 
et al., 1944). In contrast, the present investi- 
gation has studied changes in probability esti- 
mates for the one type of task as a function 
of success and failure. Diggory and Morlock 
have investigated the type of situation in 
which the subject’s goal is to attain some 
fixed level of performance in a given period 
of time or over a prescribed number of 
trials. Diggory, Riley, and Blumenfeld (1960)
showed that subjects’ estimates of their 
chances of attaining the fixed level of per- 
formance tended to decrease as the deadline 
was approached, that is, as time or trials pro- 
gressed. In the present study subjects were 
not required to estimate their chances of at- 
taining a fixed performance level. Their esti- 
mates were in terms of chances of solving the 
next item in the series of items. Nor was 
there any requirement in the instructions that 
subjects attain a fixed level of performance 
over the 15 trials of the task. The results of 
these studies by Heath and by Diggory and 
his associates are, however, consistent with 
those of the present investigation in demon- 
strating the important role of success and 
failure in determining expectations of success. 

Although the analysis of probability esti- 
mates in the present investigation implies that 
the conditions for testing the specific predic- 
tions about differences in performance be- 
tween the groups were not met,¹⁰ the analysis


10 Further studies might reduce the number of 
initial successes or failures or use more extreme re- 
ported norms (e.g., 10% and 90%) in an attempt
to determine expectations of success close to .50
among subjects in the high expectation-initial failure

of the performance data does show that initial
experience has a significant effect on subse- 
quent performance. We have already referred 
to other similar results in the literature, and 
indicated that the procedure used in the pres- 
ent study differs from those used previously 
in that subjects were able to see for them- 
selves whether they had succeeded or failed, 
and did not have to be told by the experi- 
menter. Future studies employing this pro- 
cedure might investigate the effect of differ- 
ences in the number of prior successes or fail- 
ures, or differences in the pattern of prior 
successes and failures, upon subsequent per- 
formance. 

There are at least two possible interpreta- 
tions of the difference in performance ob- 
tained in the present investigation. In the first 
place the result could be taken to imply that 
the resultant tendency to perform the remain- 
ing 10 anagrams was lower among subjects 
who underwent initial failure than among 
subjects who underwent initial success. Such 
a difference in resultant tendency may be de- 
rived from a fixed incentive model (Feather, 
1963a) which provides a basis for predicting 
a progressive increase in the tendency to 
avoid failure as a subject fails at a task, 
this increase occurring at a faster rate when 
initial expectation of success (P_s) is high and
when the motive to avoid failure (M_AF) is
relatively strong. Such an increase in the 
tendency to avoid failure would tend to re- 
duce the resultant tendency to perform the 
task after repeated failure. This model im- 
plies that the resultant tendency to perform 
the remaining 10 anagrams should be lowest 
among failure-oriented subjects in the high 
expectation-initial failure condition. However, 
this group does not have the lowest mean per- 
formance. In fact, Table 6 shows that mean 
performance scores in the initial failure con- 
ditions are higher for failure-oriented subjects 
than for success-oriented subjects although 


and low expectation-initial success conditions before 
these subjects perform the anagrams of 50% diffi- 
culty. 

11 The results of the present study have obvious 
relevance for examination techniques in situations 
where an examinee is confronted with a set of items 
of the same type. The results suggest that a wise 
strategy would be to attempt easy items first and 
so build up a sequence of initial successes.

these differences are not statistically significant.

A second interpretation of the obtained 
difference in performance is that subjects in 
the initial success conditions who solved the 
first 5 anagrams may have learned appro- 
priate ways of dealing with the anagrams 
which facilitated their performance on the 
remaining 10 anagrams. Although there were 
no set rules for solving the anagrams, sub- 
jects in the initial success conditions may 
have acquired some general responses facili- 
tating solution. Subjects in the initial failure 
conditions would not have had this oppor- 
tunity. This second interpretation therefore 
stresses the effects that prior experience may 
have in developing appropriate responses 
rather than in producing changes in tenden- 
cies. It is likely that future extensions of the 
theory of achievement motivation will need 
to include a concept of response availability 
to take account of the role of previously ac- 
quired habits in influencing performance (At- 
kinson & Feather, 1966). 

Results in Table 8 show that, in the high 
expectation-initial success condition, perform- 
ance scores are positively related to prob- 
ability estimates obtained prior to Trial 1 
and Trial 6. In this situation subjects who 
expected to work at an easy task did in fact 
experience initial success. The correlations be- 
tween performance scores and probability 
estimates prior to Trials 1 and 6 are not sig- 
nificant in the other experimental conditions. 
In a previous study (Feather, 1965a) we 
found that initial probability estimates and 
performance scores at a moderately difficult 
task were positively related among subjects 
who were told that the task would be harder 
than most. These measures were unrelated
among subjects who were told that the same 
task would be easy. The following pattern of 
results therefore seems to be emerging: In- 
structions given to subjects may set the gen- 
eral difficulty level of tasks initially, such 
that expectations of success all tend to cluster 
about a certain value for a given set of in- 
structions. Evidence shows that performance 
is related to this general difficulty level in 
that subjects work harder at tasks for which 
their chances of success are intermediate 
rather than very high or very low (cf. Atkinson,
1958). For the one set of instructions,
however, in which a general level of difficulty
is established, performance at a task tends to
be positively related to initial probability
estimates as long as the difficulty of the task
is truthfully represented to subjects. If the 
difficulty of the task is misrepresented to sub- 
jects then task performance and initial prob- 
ability estimates are unrelated. 

Why should this be so? The theory of 
achievement motivation has some difficulty in 
coping with this result. If we assume that the 
majority of undergraduate students tend to 
be success oriented (M_S > M_AF) then we
would predict a positive relationship between
task performance and initial probability
estimates when a difficult task is
presented as difficult, providing that high
initial probability estimates in this condition
reflect expectations of success close to P_s =
.50. But this theory would also imply a negative
relationship between task performance
and initial probability estimates when an easy
task is presented as easy, providing that low
initial probability estimates in this condition
reflect expectations of success close to P_s =
.50. In a previous paper (Feather, 1965a)
we examined an alternative interpretation,
namely, that, where the actual difficulty of
the task is consistent with instructions, a subject
may feel committed to the probability
estimate he states and may work to justify
it in performance. Where the actual difficulty
of the task is inconsistent with instructions a
subject may feel that his probability estimate
was based on misleading information and that
there is less need for justifying it in performance.
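
The two predictions in this paragraph follow from the usual statement of the model (Atkinson, 1958), in which the tendency to approach success is M_S × P_s × (1 − P_s), the tendency to avoid failure has the same form with M_AF and a negative sign, and the resultant tendency is therefore (M_S − M_AF) × P_s × (1 − P_s). The sketch below only illustrates that expression. The motive strengths are arbitrary assumed values and the function name is hypothetical.

```python
def resultant_tendency(p_s, m_s, m_af):
    """Resultant achievement tendency for an expectation of success p_s.

    Approach component:  m_s  * p_s * (1 - p_s)
    Avoidance component: m_af * p_s * (1 - p_s), entering with negative sign.
    """
    if not 0.0 <= p_s <= 1.0:
        raise ValueError("p_s must lie between 0 and 1")
    return (m_s - m_af) * p_s * (1.0 - p_s)


# Assumed, arbitrary motive strengths for a success-oriented subject
# (M_S greater than M_AF).
M_S, M_AF = 2.0, 1.0

for p_s in (0.10, 0.30, 0.50, 0.70, 0.90):
    value = resultant_tendency(p_s, M_S, M_AF)
    print(f"P_s = {p_s:.2f}   resultant tendency = {value:+.3f}")
```

For a success-oriented subject the resultant tendency rises as P_s approaches .50 from below, which gives the positive relationship expected for a difficult task, and falls as P_s moves above .50, which gives the negative relationship expected for an easy task. Exchanging the two motive strengths reverses the sign of every value.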

There is, however, a third interpretation of 
the tendency for performance scores to re- 
late positively to initial probability estimates 
when the actual difficulty of the task is truthfully
represented to subjects. Information in
the instructions about the task provides subjects
with the basic cues for making their
judgments of probability of success. If the
task is truthfully represented to them in the
instructions they can draw upon their past
experience at similar tasks of that general
difficulty level to guide them when they estimate
their chances of success for the present
task. In this case a positive relationship between
initial probability estimates and task
performance would reflect an underlying simi- 
larity between levels of performance in simi- 
lar tasks undertaken in the past and levels 
of performance in the present task. Thus the 
subject who has done well at similar tasks in 
the past may estimate his chances of success
as high and may also tend to do well in the 
present task. Similarly the subject who has 
done poorly at similar tasks in the past may 
estimate his chances of success as low and 
may also tend to do poorly in the present 
task. Where the task is misrepresented in the 
instructions to subjects their initial prob- 
ability estimates will again be in terms of in- 
formation about the task as defined by the 
instructions. In this case, however, past performance
in the task defined by the instructions
may not be an adequate guide to how
well the subjects perform in the actual task
they undertake. Thus, where cues in the instructions
are misleading, performance scores
may be unrelated to initial probability estimates.
This interpretation, which stresses the
tendency for subjects to refer to their past
performance in similar situations when estimating
their chances of success, also clarifies
relationships obtained between initial probability
estimates and measures of n Achievement
and Test Anxiety (Feather, 1965b). It
therefore has a fairly wide application.


12 This interpretation has difficulty in accounting 
for the lack of a significant positive correlation be- 
tween performance scores and initial probability esti- 
mates for subjects in the low expectation-initial 
failure condition. The initial failure may be the vari- 
able here that upsets the relationship. Certainly re- 
sults have shown (Feather, 1963a) that strategies 
following failure differ from those following success, 
and that performance tends to be more variable 
after failure (Lazarus & Eriksen, 1952) and perhaps 
from the subject's viewpoint therefore less pre- 
dictable.


GENERALIZATION OF ATTITUDE CHANGE THROUGH
CONGRUITY PRINCIPLE RELATIONSHIPS ¹


PERCY H. TANNENBAUM AND ROY W. GENGEL ²


University of Wisconsin 


3 source-concept linkages were 1st established—1 in which a source favors 
the concept, a 2nd in which a source is unfavorable, and a 3rd neutral relation- 
ship. In a separate message, the concept attitude is made either positive or
negative. Predictions are derived from the principle of congruity to account 
for any subsequent attitude change toward the 3 sources in all conditions. 
Although some of the changes are not in the predicted directions, the results 
support the prediction with regard to relative degree of attitude change
toward the source.


Considered as learned response tendencies 
(cf. Doob, 1947; Hovland, Janis, & Kelley, 
1953), attitudes may be acquired through in- 
direct associations as well as by more direct 
means. For example, the affective attributes
of a highly evaluated word or concept can 
be transferred to other verbal stimuli through 
classical conditioning procedures (Das & 
Nanda, 1963; Staats & Staats, 1958; Staats, 
Staats, & Heard, 1959). A related phenomenon 
involves the generalization of attitude change 
from one social object, the attitude toward 
which has been manipulated, to another ob- 
ject, where no direct manipulation is involved. 
Such generalization of persuasion can occur 
along a variety of dimensions of relationship 
between the manipulated and nonmanipulated
objects: for example, between logically related
elements of syllogisms (McGuire, 1960),
or from one health belief to another as a
function of degree of affective similarity
(Tannenbaum, 1964).

The principle of congruity (Osgood, Suci, 
& Tannenbaum, 1957; Osgood & Tannen- 
baum, 1955) may be considered as represent- 
ing a special case of generalization of attitude 
change. In its initial conception and experi- 
mental test (Osgood & Tannenbaum, 1955; 
Tannenbaum, 1956), the particular persuasion 
setting involved a communication message in
which an identifiable source makes an asser- 


1 This research was supported under Grant G-23963 
from the National Science Foundation to the senior
author, and was the basis of the junior author's
master's thesis.

2 Currently at the Central Institute for the Deaf, 
St. Louis. 


tion for or against a given concept. In ad- 
dition to the anticipated attitude change 
toward the concept—the main target of the 
message—there may also occur some change 
in attitude toward the source, which is merely 
mentioned in the message and not directly 
manipulated. According to the theory, this 
indirect modification of the source attitude³
is an integral part of the mechanism for re- 
solving an incongruous situation and render- 
ing it more congruous. Indeed, under certain 
circumstances (e.g., a neutral source making 
a favorable assertion about an extremely 
favorable concept), it is the source and 
not the concept which absorbs most of the 
pressure toward change. 

In terms of a generalization paradigm, 
there are two main cognitive operations in- 
volved in such communication situations—the 
establishment of a directed relationship be- 
tween the source and concept, and the ma- 
nipulation of the concept attitude. In the
above situations, these two operations occur 
together in the same message, and are in- 
extricably linked in that the source's assertion 
both establishes his relationship to the con- 
cept and serves to manipulate the concept 
attitude. It is thus impossible for the two 
operations to be in opposition—for example, 
we cannot have the source against the con- 


3 As used here, the term “source attitude” is intended
to represent “the subject’s attitude toward
the source,” which is rather cumbersome to use
repeatedly. It is not intended to mean the source’s
attitude toward the object. Similarly, the term
“concept attitude” is used as a more convenient way
of saying “the subject’s attitude toward the concept.”

cept and yet have the concept manipulated
favorably. 

Such opposing situations are accommodated 
in the present experiment in that the two 
critical steps are performed independently of 
one another—the source-concept link is first 
established, and then the concept attitude is 
altered without mention of the source. Pre- 
dictions are readily derivable from congruity 
theory to account for the subsequent generali- 
zation of attitude change from the concept 
to the source: As long as the initial source- 
concept linkage is recognized, the mainte- 
nance of congruity would involve an adjust- 
ment of the source attitude after the concept 
attitude has been altered. 

For example, if the source originally favors 
the concept and the latter is then made favor- 
able, a congruous situation would obtain if 
the source were also favorable, and thus a 
favorable shift in source attitude would be 
predicted. On the other hand, if the source- 
concept linkage was a negative one to begin 
with, a favorable shift on the concept should 
result in an unfavorable change in source 
attitude. If the initial linkage were neutral, 
we would expect the source attitude to be 
unaffected by a concept change. Similar but 
opposite predictions apply when the concept 
attitude is made unfavorable following the 
three respective types of linkages: If the 
source initially favored the concept, it should 
then become unfavorable. If the connection 
was a negative one, the source would change 
in a favorable direction. Again, a neutral 
source-concept linkage should result in no 
appreciable attitude change toward the source. 
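
The six predictions above reduce to a sign rule: the predicted direction of change in source attitude is the product of the sign of the source-concept linkage and the sign of the concept manipulation. The sketch below tabulates that rule under an assumed coding of +1, −1, and 0 for the three linkages and +1 and −1 for the two manipulations. The names are illustrative only, and the rule gives the direction of change, not its magnitude.

```python
# Assumed numeric codes for the three source-concept linkages.
LINKAGE_SIGN = {"favors": +1, "opposes": -1, "neutral": 0}

# Assumed numeric codes for the two concept manipulations.
MANIPULATION_SIGN = {"favorable": +1, "unfavorable": -1}

DIRECTION = {+1: "favorable", -1: "unfavorable", 0: "no change"}


def predicted_source_change(linkage, manipulation):
    """Return the predicted direction of change in attitude toward the source.

    The direction is the product of the linkage sign and the manipulation
    sign, so a neutral linkage always yields no predicted change.
    """
    sign = LINKAGE_SIGN[linkage] * MANIPULATION_SIGN[manipulation]
    return DIRECTION[sign]


for manipulation in MANIPULATION_SIGN:
    for linkage in LINKAGE_SIGN:
        change = predicted_source_change(linkage, manipulation)
        print(f"concept made {manipulation:11s} | source {linkage:7s} -> {change}")
```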


METHOD 
General Procedure 


The attitudes of all subjects on a number of 
potential sources and concepts from the field of 
psychology were first ascertained “as part of a 
project to assess students’ judgments (of) individuals 
and practices which are of some . . . controversy in 
psychology today.” This pretesting allowed for the 
selection of one concept and three sources as the 
attitudinal objects. The criteria of selection were 
more neutral mean attitude ratings and smaller vari- 
ances, on the grounds that initially neutral objects 
would allow for maximizing the degree of attitude 
change (cf. Tannenbaum, 1956). This initial session
also served to provide a base-line measure (T₀) of
the selected attitudinal objects: the concept “teaching
machines,” and three sources identified as Dr. George
L. Maclay, Professor of Psychology, Cornell University;
Dr. Walter E. Samuels, Professor of Psychology,
University of California; and Dr. Kenneth
W. Spence, Professor of Psychology, University of
Iowa.⁴

In a second test session, 2 weeks after the initial 
one, all subjects were exposed to the same linking 
message designed to establish the three types of 
source-concept relationships—one source (Maclay) 
favoring the concept (the L+ linkage condition), 
another (Samuels) being against teaching machines 
(L− condition), and the third (Spence) occupying
the neutral stance (L0 condition).⁵ After reading this
linkage message, subjects were exposed to a second 
communication, half the subjects receiving a positive 
message to make for a favorable concept evaluation 
(P treatment), the other half a negative message to 
make the attitude unfavorable (N treatment). 
Attitudes toward all four attitudinal objects were 
again assessed after the concept treatment messages 
(T₁).

It would have been preferable to have a longer 
interval between the two message exposures in order 
to make the source linkages and concept treatments 
more clearly separate from one another. However, 
pretesting had indicated that the neutral attitudinal 
objects and their respective interrelationships could
be readily forgotten—particularly which source was 
for or against the concept—with a much longer time 
span than the 5-10 minutes employed here. Since
original neutrality of attitudes was employed to 
assure an adequate degree of attitude change on the 
concept and subsequently on the sources, the present 
procedure was adopted as the lesser of two evils. 
It may be noted that both in the pretesting and in 
some postexperimental interviews, most subjects 
regarded the two messages as being independent of 
one another—and it is this factor, and not the time 
interval as such, which is of major importance here. 


Subjects 


Male and female students in an introductory 
course in psychology at the University of Wisconsin 
served as the subjects in the experiment. The first 
test session was conducted during a regular lecture 
meeting of the class, with 247 students participating. 
The second test session was conducted during regular 


4 The first two of these “psychologists” are
fictitious, as far as we know. The third, Dr. Spence,
was included to add a note of authenticity to the 
rating task so as not to arouse any undue suspicions 
about rating totally unfamiliar names. His name was 
slated to represent the neutral source and little if any 
attitude change toward this actual person was antici- 
pated. Initial ratings of Dr. Spence were close to 
neutral. 

5 An initial, perhaps more rigorous design, calling 
for the use of independent groups differentially linked 
on the same single source, had to be abandoned 
because subjects were not available for the type of 
testing involved.

weekly meetings of six course quiz sections, which
were randomly assigned to the two experimental
treatments. Only the data of subjects who were
present in both the lecture and the selected quiz
section meetings (N = 124) are included in the main
analysis, although the data of all 247 students in the
first testing were included in the concept selection
procedure.


Experimental Materials


Linkage message. The various source-concept relationships
were established in a single 300-word message,
which purported to tell the students “a little
more” about the three psychologists since they
seemed to be “totally unfamiliar with them” at the
first testing. The message proper reported on an
ostensible symposium at the 1961 convention of the
American Psychological Association dealing with
“applications of psychological theory and research
to education, particularly the use of teaching machines
for college instruction.” Prof. Maclay was
identified as a “foremost proponent of teaching
machines,” and was quoted as claiming them to be
“the most significant single contribution of psychology
to the field of education.” Prof. Samuels was
cited as long being a spokesman opposed to teaching
machines, calling them “a most disrupting influence
in the entire educational system, and a source of
shame for psychologists.” In both instances, only
such general comments for or against the concept
were included in the message, since this particular
experimental treatment was not intended to produce
any attitude shift toward the concept. Prof. Spence
was identified as the chairman of the symposium,
“one of the principal authorities in learning theory
. . . a neutral bystander on this occasion.”

Concept attitude manipulations. Both the positive
and negative treatments took the form of a copy
of an Associated Press article on a “comprehensive
report on teaching machines from the U. S. Office
of Education.” The articles themselves were about
equal in length (approximately 475 words each) and
very similar in format. Each reported a generally
favorable or unfavorable position of the Office of
Education on the subject of teaching machines, and
specifically cited a half-dozen or more arguments for
or against their use in college instruction, with liberal
use of quotations from the ostensible report. Both
messages were largely one-sided, not citing any argumentation
opposing their respective positions. In
neither was there any mention whatsoever of the
three psychologists named in the linking message.


Attitude Measure


Attitude toward each object was assessed on a 
set of four 7-point semantic differential scales
(Osgood et al., 1957). A total of 15 such scales was
used at both sessions. The 4 scales representing the
attitude measure were selected on the basis of a

TABLE 1


RESULTS OF REPEATED MEASURES ANALYSIS OF
VARIANCE ON ATTITUDE CHANGE TOWARD
THE VARIOUS SOURCES


Source                       df       MS       F

Between subjects           (123)
  Manipulations (B)            1     43.67    1.78
  Subjects/manipulation      122     24.53
Within subjects            (248)
  Sources (A)                  2     36.58    1.63
  A × B                        2    196.58    8.76*
  Sources × Subjects/B       244     22.42

*p < .001.


factor analysis of a random sample of 100 subjects 
on the initial T test ratings in terms of relatively 
high and restricted loadings on the evaluative factor, 
and relevance to the four attitudinal objects under 
consideration. They included good-bad, foolish-wise, 
valuable-worthless, and successful-unsuccessful. The 
sum of the ratings across all four scales, adjusted for 
consistency of attitudinal direction, was used to index 
a single attitude score. 
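
As a minimal sketch of how such an index could be computed, the following Python snippet sums four 7-point ratings after aligning their direction. The coding of each scale (1 for the left-hand pole, 7 for the right-hand pole), the resulting 4 to 28 range, and the function names are assumptions made for illustration; the text does not report them.

```python
# Illustrative sketch only: the 1-7 coding direction and the names below
# are assumptions, not details reported in the article.
SCALES = {
    # scale name: True if the favorable pole is the left-hand (first) adjective
    "good-bad": True,
    "foolish-wise": False,
    "valuable-worthless": True,
    "successful-unsuccessful": True,
}


def attitude_score(ratings):
    """Sum the four ratings so that a higher total is always more favorable."""
    total = 0
    for scale, favorable_on_left in SCALES.items():
        rating = ratings[scale]
        if not 1 <= rating <= 7:
            raise ValueError(f"rating for {scale} must lie between 1 and 7")
        # Reverse-score scales whose favorable pole is on the left (coded 1).
        total += (8 - rating) if favorable_on_left else rating
    return total


def change_score(first_session, second_session):
    """Attitude change as second-session score minus first-session score."""
    return attitude_score(second_session) - attitude_score(first_session)
```

Under these assumptions a subject who marks the midpoint of every scale receives 16, and change scores can range from -24 to +24.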


RESULTS 


A necessary prerequisite for the analysis of
change in the various source attitudes was the
success of the intended experimental manipulations
of the concept attitude. Analysis
of the T₁ − T₂ change scores on teaching
machines revealed that both manipulation
treatments apparently “took,” somewhat
more so for the positive (mean change
= +5.11, p < .02 by sign test) than the
negative (−1.25, p < .05) manipulations,
with the difference between the two treatments
being highly significant (t = 5.81, df
= 122, p < .001).

The first analysis on the main dependent 
variable of source attitude change was a re- 
peated measures analysis of variance of the 
respective change scores, the results of which 
are indicated in Table 1. There is no significant
difference on either main effect (between
the source-concept linkages, and between the
concept manipulations), but there is a highly
significant interaction between the two. These 
findings are perfectly in keeping with the 
congruity principle predictions: Within a 
given concept treatment, the different sources 
should show differential change, tending to 
cancel out one another. Similarly, comparing 
across the two concept treatments, we would

TABLE 2


MEAN ATTITUDE CHANGE TOWARD VARIOUS SOURCES
BY TYPE OF SOURCE LINKAGE AND CONCEPT
MANIPULATION TREATMENTS


                            Source linkage
Concept             L+          L0          L−
manipulation     (Maclay)    (Spence)    (Samuels)    Marginals

Positive
  (n = 57)         4.75        4.08        1.87         3.57
Negative
  (n = 67)         1.44        3.59        3.61         2.88
Marginals          3.10        3.84        2.74


Note. Means with the same subscript are not significantly
different from one another at the .05 level (by Dunn test,
one-tailed). These comparisons apply only within a given row
or within a given column, but not across both simultaneously.


expect the sources to change in opposite directions
(at least for the L+ and L− linkages),
again resulting in a canceling-out effect. Such 
circumstances would also lead to a significant 
interaction effect, as observed. 
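
The F ratios in Table 1 can be checked from the mean squares. The sketch below is an illustration, not the authors' computation: it assumes the usual repeated measures layout in which the between-subjects effect is tested against subjects within manipulations and the within-subjects effects against the Sources × Subjects/B term, and it reads the garbled last mean square as 22.42 (a reading consistent with the reported F values).

```python
# Mean squares as read from Table 1; 22.42 is an assumed reading of the
# within-subjects error term, chosen because it reproduces the printed Fs.
MEAN_SQUARES = {
    "manipulations": 43.67,
    "subjects_within_manipulations": 24.53,
    "sources": 36.58,
    "sources_x_manipulations": 196.58,
    "sources_x_subjects": 22.42,
}


def f_ratio(ms_effect, ms_error):
    """F is the effect mean square divided by its error mean square."""
    return ms_effect / ms_error


# Between-subjects effect uses the between-subjects error term.
f_manipulations = f_ratio(
    MEAN_SQUARES["manipulations"],
    MEAN_SQUARES["subjects_within_manipulations"],
)

# Within-subjects effects use the within-subjects error term.
f_sources = f_ratio(MEAN_SQUARES["sources"], MEAN_SQUARES["sources_x_subjects"])
f_interaction = f_ratio(
    MEAN_SQUARES["sources_x_manipulations"],
    MEAN_SQUARES["sources_x_subjects"],
)
```

Only the interaction ratio is large relative to its error term, which matches the pattern described in the text.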

The fit of the experimental data to the 
theoretical predictions can be more clearly 
realized through closer scrutiny of the various 
mean change scores, as presented in Table 2. 
One discrepancy that is immediately apparent 
is that while congruity theory predicts some 
negative shifts in the source attitudes (in
the PL− and NL+ conditions), all mean
attitude changes are favorable. However, 
predictions regarding relative attitude change 
may still be tested. Table 2 includes the re- 
sults of such analyses, summarizing the find- 
ings of the Dunn (1961) test for multiple 
comparisons between all possible pairs of 
sources within a concept treatment, and 
between treatments within a given source.
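
A quick arithmetic check suggests that the marginals in Table 2 are unweighted means of the cell means rather than means weighted by the two sample sizes. The sketch below is illustrative; it reads the two garbled cells as 1.87 and 3.59 (an assumption that reproduces the printed marginals), and the helper names are hypothetical.

```python
# Cell means as read from Table 2; 1.87 and 3.59 are assumed readings of
# cells that are damaged in the available copy.
CELLS = {
    "Positive": {"L+": 4.75, "L0": 4.08, "L-": 1.87},
    "Negative": {"L+": 1.44, "L0": 3.59, "L-": 3.61},
}
REPORTED_ROW_MARGINALS = {"Positive": 3.57, "Negative": 2.88}
REPORTED_COLUMN_MARGINALS = {"L+": 3.10, "L0": 3.84, "L-": 2.74}


def unweighted_mean(values):
    """Plain average of the cell means, ignoring cell sizes."""
    values = list(values)
    return sum(values) / len(values)


def agrees(computed, reported, tolerance=0.0051):
    """True when the computed mean matches the two-decimal printed value."""
    return abs(computed - reported) <= tolerance


row_means = {
    treatment: unweighted_mean(cells.values())
    for treatment, cells in CELLS.items()
}
column_means = {
    linkage: unweighted_mean(CELLS[treatment][linkage] for treatment in CELLS)
    for linkage in REPORTED_COLUMN_MARGINALS
}
```

A weighted mean for the L+ column, using n = 57 and n = 67, would come to about 2.96 rather than the printed 3.10, which is why the unweighted reading is preferred here.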

Within the positive manipulation treatment,
it is readily apparent that the differential
change is as predicted for the L+ and L−
sources, the former showing significantly
(p < .01) more positive change. However, L0
also shows substantial change, not significantly
less than L+. Similar findings prevail
for the negative concept manipulation, with
L− showing significantly (p < .05) more
favorable change than L+, as predicted, but
again with L0 showing as much change as
L−. Comparing within a given source linkage
across the two concept treatments, there is
the predicted greater favorable change on
Maclay in the P than the N treatment
(p < .01), no difference on Spence, but contrary
to the theoretical prediction, lack of a
significant difference for the negative source
linkage (Samuels). On this last comparison,
however, a t test, also appropriate here, was
significant (t = 1.98, df = 122, p < .05).

These findings are further underlined by
an alternative analysis in terms of how the
individual subjects actually changed on their
concept attitudes, rather than how they were
supposed to change, as a result of the manipulation
messages. That is, when the concept
change scores were examined, it was
noted that some subjects in both conditions
indicated a change in attitude toward teaching
machines in a direction opposite from that
advocated by their respective messages. For
example, a given subject in the P treatment
may have for some reason changed in an
unfavorable rather than the intended favorable
direction. In such cases, we would expect
this subject to behave as if he were in the N
rather than the P manipulation treatment.
Table 3 presents the source attitude change
means in terms of such a division of subjects
into the P and N treatments. Six subjects
who showed no change in the concept were
accordingly dropped from this analysis. The
results are even more supportive of the main
theoretical predictions: All means are somewhat
more pronounced in magnitude in accord
with the congruity predictions; indeed,
there is now a negative, albeit slight, shift
toward the positively linked source in the N


TABLE 3


MEAN SOURCE ATTITUDE CHANGES WITH SUBJECTS
ALLOCATED TO EXPERIMENTAL TREATMENTS
IN TERMS OF THEIR CONCEPT
ATTITUDE CHANGE


                            Source linkage
Concept             L+          L0          L−
manipulation     (Maclay)    (Spence)    (Samuels)    Marginals

Positive
  (n = 67)         5.49        3.98        1.16         3.54
Negative
  (n = 51)         −.25        3.52        4.88         2.72
Marginals          2.62        3.75        3.02


Note. As before, means with the same subscripts are not
significantly different from one another at the .05 level, all by
one-tailed Dunn test; p < .01 for all other differences. Again,
comparisons apply only within a given row or column.

manipulation treatment. All significant differences
are beyond the .01 level by Dunn
test, even that on L− between the P and N
treatments, which failed to meet the .05 level
on the previous analysis.
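
The reallocation used for this alternative analysis can be sketched as follows. This is an illustration under stated assumptions: positive concept change scores are taken to mean a shift in the favorable direction, and the function names and sample data are hypothetical.

```python
# Illustrative sketch: subjects are regrouped by the direction in which
# their own attitude toward the concept actually moved.
def reallocate(concept_change):
    """Return 'P' for favorable change, 'N' for unfavorable, None for none."""
    if concept_change > 0:
        return "P"
    if concept_change < 0:
        return "N"
    return None  # no change on the concept: dropped from this analysis


def split_by_actual_change(concept_changes):
    """Group subject identifiers by their reallocated treatment."""
    groups = {"P": [], "N": [], "dropped": []}
    for subject, change in concept_changes.items():
        treatment = reallocate(change)
        groups[treatment if treatment else "dropped"].append(subject)
    return groups


# Hypothetical change scores for four subjects.
example = split_by_actual_change({"s1": 3, "s2": -2, "s3": 0, "s4": 5})
```

In the study itself this rule left 67 subjects in the P group and 51 in the N group, with the 6 unchanged subjects excluded.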


DISCUSSION


Despite some contrary findings, the results 
of this investigation provide support for the 
generalization of attitude change from con- 
cept to source in the communication situation 
employed. Differential degrees of change in 
source attitude did obtain as a function of 
differential source-concept linkages, as did the 
relative change on the same source as between 
different treatments of the concept. 

One of the unanticipated results was change 
toward the source in a favorable direction 
under certain conditions when a negative 
change was predicted. A possible explanation 
may be in terms of a general halo effect 
accruing to both the positively linked and 
negatively linked sources merely as a function 
of their being presented as recognized authori- 
ties. This setting was, after all, represented 
as a symposium sponsored by a major professional
association. To the undergraduates
serving as subjects in the study, almost any 
psychologist taking an authoritative position 
on either side of an apparently significant 
professional issue in such a setting may be 
favorably regarded. 

Such an interpretation could also explain 
the other unpredicted, if less serious, finding 
of a relatively high degree of favorable atti- 
tude change toward the neutral source under 
both manipulation treatments. Just being a 
chairman of such a convention session might 
in itself be regarded—by our subjects, at 
least—as an honor to be accorded to only 
favorably recognized psychologists. In addi- 
tion, the linkage message identifying Dr. 
Spence as the symposium chairman also referred
to him as “one of the principal authorities
on learning theory,” and this may have
been sufficient to produce the favorable 
changes noted. 

Such interpretations, it should be noted, are 
quite independent of any considerations of 
change occurring as a function of the generalization
of attitudinal modification. However,
the other findings (those more in accord
with the theoretical predictions) must be accounted
for in terms of some generalization
model. The principle of congruity served as 
the specific model here, but similar predic- 
tions may be derived from a number of the 
so-called “consistency models” (cf. Brown,
1962). Generally, these models contend that 
the introduction of any inconsistency in a 
set of cognitive relationships sets up mecha- 
nisms which operate, directly or indirectly, to 
restore consistency. Such mechanisms often
involve the manifestation of change in cogni- 
tively related but not specifically mentioned 
objects. 

For Rosenberg (1960), for example, it is 
consistent (or “balanced,” to use his term) 
for hypnotically induced change in the affect 
of an object to lead to corresponding opinion 
changes in its cognitive components, and, in 
turn, in other related objects. Similarly, the 
basic notion in McGuire's (1960) approach 
is that people strive for consistency among 
opinions on logically related issues, thus ac- 
counting for the spread of opinion change 
from the experimentally manipulated premise 
of a syllogism to its unmentioned conclusion. 
In the present investigation, the focus is on 
a directed evaluative relationship between two 
attitudinal objects as a major dimension along 
which generalization may occur, but also 
in a direction that keeps the relationship 
psychologically consistent or congruous.


ROLE REWARD AND DISSONANCE REDUCTION 1

In the context of an “experiment on debating,” Ss volunteered to defend a
position with which they were in private disagreement. On the basis of ratings
by audience members, Ss were told that the intellectual content of their speeches
was superior (content reward) or that the manner in which they complied
was superior (role reward). A 3rd group of Ss was given neutral feedback for
both content and role. Two control groups were employed in which Ss defended
publicly the position with which they were in agreement. In the 1st control group
(group opinion control) Ss received neutral feedback for both content and
role while Ss' opponents were rewarded strongly for the content of their
speeches. In the 2nd control condition (control), Ss and their opponents received
neutral feedback for both content and role. Role-reward Ss showed
greater attitude change in the direction of their publicly stated opinion than
did all other groups.


Since the introduction of experiments by 
Janis and King (1954; King & Janis, 1956) 
on the effects of forced compliance upon atti- 
tude change, considerable attention has been 
directed to the examination of situations in 
which subjects are either forced or induced by 
varying amounts of an incentive to behave in 
a manner which is contrary to their private 
opinions. A variety of experiments dealing 
with opinion change in this setting have been 
stimulated by Festinger's (1957) theory of 
cognitive dissonance. While numerous experi- 
ments have dealt with variation in amount 
of justification prior to compliance, few have 
dealt with the effects of reward following such 
behavior. An interesting experiment by Freedman
(1963) has suggested that the point at
which justification is given may be critical.
Variation in the magnitude of incentives prior
to complying typically produces a reversal of
expected reward magnitude effects, that is,
small incentives produce greater effects than
larger incentives prior to compliance (e.g.,
Festinger & Carlsmith, 1959). On the other
hand, variation in justification after compliance
does not produce this curious reversal
(Freedman, 1963).

1 This research was supported by Grant MH
07835-01 from the National Institutes of Health,
United States Public Health Service, to Leon
Festinger, Stanford University. Pilot work for the
experiment was supported by funds from the Graduate
School, Stanford University.

2 The author is indebted to Leon Festinger for
his advice and encouragement. In addition, the
author wishes to express his appreciation to Jonathan
Freedman and Merrill Carlsmith, both of whom
contributed valuable critical comments.

While studies by Goldstein and McGinnies 
(1964) and Scott (1957, 1959) have been 
concerned with the effects of reward following 
forced compliance upon subsequent opinion 
change, the mechanisms underlying change 
remain unclear. The notion that reward provides
“environmental supports” for a changed
cognition seems somewhat limited in that it 
concentrates upon one rather narrow aspect of 
the informational potential of a rewarding 
event. When reward is administered in the 
form of a single undifferentiated verbal or 
nonverbal sign of approval following a com- 
plex sequence of behavior, a rewarded subject 
can have little differentiated information about 
his performance. He has little knowledge of
what aspect or aspects of his behavior led to 
receipt of reward. 

The present research stems from a concep- 
tion of reward as informative. Reward 
provides the subject in a forced-compliance 
setting with information about the impact of 
specific aspects of his behavior upon others. 
Such information is useful to the subject in 
interpreting his behavior. By providing the 
subject with differential information, that is, 
by rewarding him for different aspects of his 
performance, it may be possible to provide 
him with alternative ways to construe his 
behavior in a forced-compliance setting. Dis- 
sonance is generated when a cognition about 
one's beliefs is incompatible with a cognition
concerning one's behavior. Hence, the interpretation
that the subject places upon his
behavior may operate to either increase or 
reduce dissonance. In the present research, 
two obvious aspects of performance in a 
voluntary-compliance setting were selected for 
study. In an experimental context consisting 
of an “impromptu debate,” subjects were rewarded
after having volunteered to defend
and having defended a position contrary to 
their private beliefs. Following compliance, 
they either received information suggesting 
that the actual intellectual content of their 
speeches was perceived by audience members 
as superior or information that their role- 
playing ability (the manner in which they 
gave their speeches) was considered superior 
by audience members. 

A pilot study conducted upon 40 subjects 
indicated that quite different effects could be 
expected for content-reward subjects and sub- 
jects rewarded for their “role-playing ability.” 
In contrast to predictions made at that time, 
role-reward subjects showed striking changes 
in attitudes while content-reward subjects 
failed to show changes greater than a volun- 
tary compliance-neutral feedback condition in 
which subjects received neutral feedback for 
both content and role behavior. The pres- 
ent research was conducted to explore this 
rather intriguing and unexpected finding more 
thoroughly.


METHOD
Subjects 


Experimental subjects consisted of 100 paid volun- 
teers, male and female, from introductory psychology 
classes at a California junior college. In each of the
experimental conditions and the control group, there 
were 20 subjects, 7 with initial attitudes opposed 
to capital punishment and 13 with initial attitudes 
in favor of capital punishment. The subjects ranged 
in age from 17 to 40 and were quite heterogeneous 
with regard to such characteristics as social class, 
occupational endeavor, etc. Of the original 100 sub- 
jects sampled, 5 proved unusable and were replaced 
by subjects drawn from the same subject source. 
Of the 5 subjects who were replaced, 2 were speech 
phobics and 3 refused to speak against their posi- 
tions. The replacement subjects were chosen on the 
basis of pretest scores which were highly similar to 
those of the subjects they replaced. The ratio of 
males to females within conditions was approximately
equal.

Materials

Attitude measurement. Items from the Thurstone
Attitude toward Capital Punishment scale (Thurstone
& Chave, 1929) were arranged in pairs to form
a 12-item Likert scale. Each item consisted of two
alternatives, A and B, between which the subject
was required to choose by circling one of five possible
responses as follows: A = Statement A is entirely
preferred to Statement B as being descriptive
of my attitude; B = Statement B is entirely preferred
over Statement A; a = Statement A is somewhat
preferred to Statement B; b = Statement B
is somewhat preferred over Statement A; ? = cannot
choose between Statements A and B. Items were
scored 5, 4, 3, 2, or 1. Minimum and maximum
scores were 12 and 60, respectively.

The pairs of items were constructed such that 
for half of the items circling A indicated favorable 
attitudes toward capital punishment while for the 
other half, circling B indicated favorable attitudes 
toward capital punishment. In this fashion, it was 
possible to assess possible response sets attributable 
to item form. The correlation between A+ and B+ 
items was .90. A correlation of this magnitude sug- 
gests that subjects were responding to item content 
rather than item form. The split-half reliability of 
the test was .95. The Attitude toward Capital Pun- 
ishment scale was imbedded in a series consisting 
of seven other scales of a similar nature which 
measured different attitudes. 
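
The scoring of this paired-statement scale can be illustrated with a short sketch. Several details are assumptions made for the example and are not stated in the text: that a higher score means a more favorable attitude toward capital punishment, that the first six items are the ones on which circling A is favorable, and the names used in the code.

```python
# Illustrative scoring sketch for the 12-item scale. The direction of
# scoring and the item ordering are assumptions, not reported details.
A_FAVORABLE_KEY = {"A": 5, "a": 4, "?": 3, "b": 2, "B": 1}
B_FAVORABLE_KEY = {"A": 1, "a": 2, "?": 3, "b": 4, "B": 5}

# Hypothetical item layout: items 0-5 favor capital punishment when A is
# circled, items 6-11 when B is circled.
A_FAVORABLE_ITEMS = frozenset(range(6))


def score_scale(responses):
    """Return the total score for a list of 12 circled responses."""
    if len(responses) != 12:
        raise ValueError("the scale has exactly 12 items")
    total = 0
    for index, response in enumerate(responses):
        key = A_FAVORABLE_KEY if index in A_FAVORABLE_ITEMS else B_FAVORABLE_KEY
        total += key[response]
    return total


undecided = score_scale(["?"] * 12)
most_favorable = score_scale(["A"] * 6 + ["B"] * 6)
least_favorable = score_scale(["B"] * 6 + ["A"] * 6)
```

With these keys the totals span the reported range of 12 to 60, and a subject who circles "?" throughout scores 36, the midpoint of the scale.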


Experimental Design 


Subjects were matched by quintuplets on the basis 
of pretest scores and then assigned randomly to 
one of three experimental conditions or two control 
conditions. Since the pilot study indicated that 
initial attitude was likely to prove a significant 
source of variability, a 2 × 5 experimental design
with initial attitude (for or against capital punish- 
ment) by experimental treatments was employed. 
A description of the experimental and control groups 
follows: 

Content reward. Subjects volunteered to defend 
publicly a position with regard to capital punishment 
with which they privately disagreed. At the con- 
clusion of the debate and following ratings by 
accomplices, these subjects were told that the actual 
arguments they raised in defense of their publicly 
stated opinion were perceived by the audience (ac- 
complices) as constituting extremely powerful, log- 
ical and persuasive communications. While content- 
reward subjects were told that their actual arguments 
were superior, they were also told that the manner 
in which they gave their portion of the debate was 
perceived by the audience as comparable to that of 
the average college student. The subject's opponent 
in the debate (an accomplice of the experimenter) 
was told that he was comparable to the average 
college student both in the actual content of his
speech and in the manner in which he gave
his speech.

3 Copies of this scale are available from the
author.

Role reward. Subjects volunteered to defend
publicly a position with regard to capital punishment
with which they privately disagreed. At the
conclusion of the debate and following the ratings
by accomplices, these subjects were told that while
the actual content of their speech was considered
comparable to that of the average college student,
the manner in which they gave their speeches was
considered superior by the audience. The subject's
opponent in the debate was told that he was comparable
to the average college student in both
content and manner.

Neutral feedback. Except for the feedback about 
performance, this condition was identical to the con- 
tent- and role-reward conditions described above. 
In the neutral feedback condition, both the subject 
and his opponent were told that the audience per- 
ceived them as quite comparable to one another as 
well as to the average college student in both content 
and manner. 

Group opinion. This condition was one of two 
control conditions. The subject volunteered to 
defend publicly the position with which he was in 
agreement. However, at the conclusion of the debate, 
the subject was given neutral feedback on both 
content and manner while his opponent was told 
that he was perceived by group members as present- 
ing a powerful, logical, and persuasive argument. 
Hence in this condition, the subject was led to believe 
that group opinion was running strongly against 
his position. The group opinion condition was in- 
tended as a control for the effects of social influence. 

Control. A nontreatment control group was con- 
sidered inappropriate for this experiment. Control 
subjects constituted the second non-forced-compliance 
group. They volunteered to defend publicly a position
with which they were in agreement as measured
by the pretest. Both the control subject and his op- 
ponent received neutral feedback with regard to 
both content and manner. In the present experi- 
ment, the control group should show the combined 
operation of several possible threats to experimental 
design. These are regression effects, the influence of 
the opponents’ arguments, temporal stability of the 
measuring instrument, and possible confounding by 
history (extraexperimental events which might pro- 
duce significant changes in attitudes). 
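
The matched assignment described under Experimental Design (quintuplets matched on pretest score, then randomly assigned to the five conditions) could be sketched as below. This is only an illustration: forming blocks by sorting on pretest score is an assumed matching rule, the sketch ignores the separate handling of initially favorable and opposed subjects, and all names and data are hypothetical.

```python
import random

CONDITIONS = [
    "content reward",
    "role reward",
    "neutral feedback",
    "group opinion",
    "control",
]


def assign_matched(pretest_scores, rng):
    """Assign subjects to conditions within blocks of five similar scorers."""
    # Sort by pretest score so that neighbors form a matched quintuplet.
    ranked = sorted(pretest_scores, key=pretest_scores.get)
    if len(ranked) % len(CONDITIONS) != 0:
        raise ValueError("number of subjects must be a multiple of five")
    assignment = {}
    for start in range(0, len(ranked), len(CONDITIONS)):
        block = ranked[start:start + len(CONDITIONS)]
        shuffled = CONDITIONS[:]
        rng.shuffle(shuffled)  # random order of conditions within the block
        for subject, condition in zip(block, shuffled):
            assignment[subject] = condition
    return assignment


# Hypothetical pretest scores for 100 subjects on the 12-60 scale.
rng = random.Random(0)
scores = {f"s{i}": rng.randint(12, 60) for i in range(100)}
assignment = assign_matched(scores, rng)
```

Because every block contributes one subject to each condition, the five groups end up equal in size and closely matched on pretest score, which is the property the design relies on.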


Procedure 


Four weeks prior to the first experimental sessions, 
subjects were pretested on the Attitude toward 
Capital Punishment scale. The Attitude toward 
Capital Punishment scale was imbedded in a series 
of seven other attitude scales of similar form but 
different content. A rational cutting score was employed
to divide subjects into two groups, those
initially in favor and those opposed to capital punishment.
Since a completely undecided subject would
achieve a score of 36 on the attitude scale employed,
this score was taken as the dividing point.
The pretest means for the various conditions were as
follows: content, 38.35; role, 38.75; neutral feedback,
39.19; group opinion, 39.10; and control, 38.40. At 
the time of pretesting, subjects were told nothing 
more than that they were being asked to participate 
in a psychological experiment and would be paid $2 
for their participation. Subjects were pretested in 
two groups within their classrooms. The experimenter
was present during the pretesting. In order
to provide a further control for confounding by 
naturally occurring extraexperimental effects, sub- 
jects were systematically assigned to data gathering 
days. In this fashion, equal numbers of subjects had 
been run in each of the experimental groups and 
the control group at the close of each week of data 
gathering. Considering the sensitive nature of the 
debate topic, capital punishment, such a procedure 
was considered most appropriate. 

Each subject met in a group consisting of the 
experimenter and three accomplices. The accomplices 
consisted of two boys and a girl, college students on 
summer vacation from schools other than the one 
from which the subjects were obtained. The ac- 
complices were carefully trained to present their 
speeches either for or against capital punishment 
in a manner one would expect in an impromptu 
debating setting. They were given speeches for and 
against capital punishment which they memorized 
and presented as arguments against the subject. 
Thus, the actual arguments, either for or against 
capital punishment, were held constant across
experimental conditions.

Accomplices were instructed to appear attentive 
throughout the debate but to avoid any signs of 
approval such as head-nodding, smiling, etc. This 
condition was fulfilled by the accomplices. For most 
experimental subjects, the accomplices were aware of 
their initial attitudes. Since some subjects volunteered 
along with the accomplices to speak in favor of the 
first mentioned position, it was possible for the 
accomplices to become aware of the subject's true
opinion. However, accomplices were unaware of the 
reward condition to which the subject had been 
assigned until the close of the experiment when 
feedback about performance was given. Some sub- 
jects felt it necessary to announce to the group 
what their “true” opinion was prior to participation 
in a voluntary compliance condition. Unfortunately, 
no effort was made to identify these subjects in the 
data analysis. The one female accomplice always 
opposed female subjects in the debate with the two 
male accomplices serving as the audience. One of the 
male accomplices always opposed male subjects in 
the debate while the other male accomplice served 
only as an audience member. Thus, for male sub- 
jects, the audience consisted of one female and one 
male accomplice. 

The experimenter was present throughout the 
experimental session. During the debate, the ex- 
perimenter maintained an air of quiet objectivity. 
The experimenter did not show signs of approval 
nor did he look at either of the debaters during the 
debate. The author served as experimenter. 

The following instructions were read:

I have selected you four people from different
classes here at ______ College for a reason. This
was done to eliminate the possibility that persons 
who know one another quite well would be in the 
same experimental session. If any of you do know 
some other person here quite well I'll have to 
postpone this session and mix you with other 
students. [At this point, the experimenter paused 
and waited while the subject and the accomplices 
looked at one another.] OK. This is an experiment 
in impromptu debating. The topic of the debate 
is capital punishment. Two of you will give 
speeches about capital punishment, one will speak 
for and one will speak against. 

Now, who is willing to speak for [against] 
capital punishment? [The experimenter's accomplices
had been previously instructed to volunteer
for the first position requested by the experimenter.
The appropriate accomplice was always
selected.] Alright, Mr. ______ has agreed
to speak for [against] capital punishment. Who
is willing to volunteer to speak against [for]
capital punishment?


The accomplices had been previously instructed 
not to volunteer for the second position requested 
by the experimenter. In order to avoid giving the
subject justification sufficient to reduce dissonance, 
the experimenter simply looked at the subject and 
the two accomplices and waited. After periods of 
time ranging from a few moments to several minutes, 
subjects volunteered to defend the position with 
which they privately disagreed. The subject was 
not urged, directly requested, or forced to defend a 
position opposite that of his private beliefs. With 
apparently minimal justification from the experi- 
menter, subjects volunteered. As mentioned previ- 
ously, three subjects refused to comply. No attempt 
was made by the experimenter to force these subjects 
to comply. They were simply dismissed from the 
experiment after the experimenter explained the 
nature of the research. After the subject had volunteered
to speak either for or against capital punishment,
the instructions continued as follows:


Alright then, Mr. ______ will speak for [against]
and Mr. ______ will speak against [for] capital
punishment. You other two people are to listen
carefully to the speeches as I shall require something
of you at a later point. Mr. ______, let's
begin with you [the accomplice always spoke
first].

After the speeches were presented, the instructions
continued as follows:


Now that the speeches have been presented, I 
want you two, the audience, to fill out these 
rating scales. The first scale is a rating of the 
actual content of the person's speech. On this 
scale, you will concern yourself with the argu- 
ments raised by the speaker in defense of his 
position. You will judge how reasonable, logical,
coherent, and persuasive these arguments appeared 
to you. The second scale is a measure of the
manner in which the person gave his speech. Here
you are to ignore the actual content of the speech
and concentrate entirely upon the manner in
which the person delivered his speech. Read the
instructions carefully and rate the speakers on
these two rating sheets.


While the subject and his opponent were being 
rated by the accomplices, they were given rating 
sheets for content and manner so that they “could 
see for yourselves what it is that you are being 
rated upon.” * The experimenter “scored” the ratings 
in full view of the subject. The instructions continued 
as follows: 


Here are the results of the ratings. Now, you 
were rated on two separate things, the content 
of your speech and the manner in which you 
gave your speech. In one way, this situation is a 
bit comparable to a play and acting. First, we 
have the actual content of the actor’s speeches to 
his audience. The content is the message a play- 
wright means to get across to his audience. . . 
these are the ideas of the play if you will. 
Secondly, we have the manner in which the actor 
performed his role . . . his acting ability. 


Feedback to the subject about performance in the 
role-reward group was as follows: 


In terms of content, your fellow students 
rated you at mean in comparison with other 
college students. In other words, considering the 
extent to which the raters saw your arguments 
for [against] capital punishment as constituting 
powerful, convincing, logical, reasonable, and per- 
suasive arguments, you were seen as being com- 
parable to the average college student. In terms 
of your debating style, your fellow students 
rated you quite highly. The scores you received 
indicate that you fell at the 99% among college 
students on the manner in which you gave your 
speech. Thus, in terms of the actual content of 
your speech . . . the forcefulness, logic, and 
persuasiveness of your arguments, you were 
average while the manner in which you gave your 
speech was considered superior. In other words, 
you are quite a good actor.5 


The Attitude toward Capital Punishment scale 
was readministered after the feedback conditions 
had been imposed. Upon presentation of the attitude 
scale, subjects were told, “while this may appear to 
be similar to something you have done for me 
before, be extremely careful as it is not identical.” 


*Copies of the rating scales employed by the 
accomplices are available from the author. 

5 Feedback about performance to experimental 
subjects in other groups and to accomplices was 
varied appropriately. All subjects were told they 
were rated “average” or “superior” on content or 
manner or “average” on both depending upon the 
condition to which they were assigned. Description 
of the various feedback conditions is given earlier 
in the Procedure section. 
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d 
Initial attitude (A)      359.48 
Treatments (B)           1078.10 
A × B                     713.50 
Within (error)           4220.41 
* p < .01. 
"p $m. 
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TABLE 1 
MEANS AND SDs FOR ATTITUDE-CHANGE SCORES 

                                              Condition 
Initial attitude     Content         Role            Neutral        Group opinion    Control 
                     M      SD       M       SD      M      SD      M      SD        M       SD 
Against (N = 35)      .86   6.60     −1414    3.85   3.57   7.25     .00   6.27      −2.71   [illegible] 
For (N = 65)         4.23   7.64     11.85   10.15   4.77   6.78    1.23   5.89      −3.92   4.13 


Questioning of the subjects following the experiment 
indicated that virtually all subjects felt that the 
posttest was not the same as the pretest which had 
been administered to them in their classes. 


RESULTS 


The data for the analysis consisted of 
prepost difference scores. An analysis of vari- 
ance, Initial Attitude X Treatments, was con- 
ducted (Winer, 1962, pp. 241-244). Means 
and standard deviations for the attitude- 
change scores are given in Table 1. The re- 
sults of the analysis of variance are presented 
in Table 2. 
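
As a rough check on the analysis of variance, each F ratio can be recomputed from the sums of squares as the effect mean square divided by the within-cell mean square. The sketch below does this. The degrees of freedom are not printed in the summary table and are assumptions here (2 initial-attitude levels, 5 conditions, 100 subjects in 10 cells), so the recomputed values need not agree exactly with the published ones.

```python
# Sums of squares as printed in the analysis-of-variance summary.
SS = {"A": 359.48, "B": 1078.10, "AxB": 713.50, "within": 4220.41}

# ASSUMED degrees of freedom (not printed in the source): 2 attitude levels,
# 5 conditions, and 100 subjects spread over the 10 cells.
DF = {"A": 1, "B": 4, "AxB": 4, "within": 90}


def f_ratios(ss, df):
    """Return F = MS_effect / MS_within for every effect except the error term."""
    ms_within = ss["within"] / df["within"]
    return {
        effect: (ss[effect] / df[effect]) / ms_within
        for effect in ss
        if effect != "within"
    }


if __name__ == "__main__":
    for effect, f in f_ratios(SS, DF).items():
        print(f"{effect}: F({DF[effect]}, {DF['within']}) = {f:.2f}")
```

Under these assumed degrees of freedom the ratios for initial attitude and for the interaction come out at the printed values; any remaining discrepancy may reflect the assumed degrees of freedom or the legibility of the printed table.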

As suspected from the pilot study, initial 
attitude proved a significant source of vari- 
ability. In this experiment, it was clearly 
easier to change attitudes held by subjects 
who were originally in favor of capital punish- 
ment. Subjects who were initially opposed to 
capital punishment showed little attitude 
change. The significant main effect for treat- 
ments indicates that differential change was 
obtained among the various experimental and 
control groups. The significant Initial At- 
titudes × Treatments interaction obtained ap- 
peared to be largely attributable to the scores 
for the role-reward group. 

Following the overall analysis of variance, 
individual comparisons among the treatment 


TABLE 2 
SUMMARY OF ANALYSIS OF VARIANCE ON 
ATTITUDE-CHANGE SCORES 

Source                      SS          F 
Initial attitude (A)       359.48      7.67* 
Treatments (B)            1078.10      5.95* 
A × B                      713.50      3.80* 
Within (error)            4220.41 


means were conducted. Since none of the dif- 
ferences among the means for subjects holding 
initial attitudes opposed to capital punish- 
ment were significant, the following results 
are based upon subjects whose initial at- 
titudes were in favor of capital punishment. 
Table 3 presents a summary of individual 
t tests based only upon data from subjects 
whose initial attitudes were in favor of capital 
punishment. 
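
The mean differences entering these individual comparisons can be recomputed from the condition means for subjects initially in favor of capital punishment. The following sketch uses the means as read from Table 1 (decimal points restored where the print is unclear); the dictionary and function names are illustrative only, and recomputed differences may depart slightly from the printed entries.

```python
from itertools import combinations

# Mean attitude-change scores, subjects initially in favor (read from Table 1).
MEANS_FOR = {
    "role": 11.85,
    "content": 4.23,
    "neutral": 4.77,
    "group opinion": 1.23,
    "control": -3.92,
}


def pairwise_differences(means):
    """Return the difference between means for every pair of conditions."""
    return {
        (first, second): round(means[first] - means[second], 2)
        for first, second in combinations(means, 2)
    }


if __name__ == "__main__":
    for (first, second), diff in pairwise_differences(MEANS_FOR).items():
        print(f"{first}-{second}: {diff:+.2f}")
```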

As indicated in Table 3, all of the experi- 
mental groups differ significantly from the 
control group. Group-opinion subjects, the 
control for social influence, did not differ 
significantly from control subjects. 

The role-reward group differed significantly 
from the content-reward, neutral-feedback, 
and group-opinion conditions. This would 
indicate that the effects of rewarding role be- 
havior appear greater than those expected by 
content reward, dissonance reduction (neutral 
feedback), or social influence (group opinion). 

None of the possible comparisons among 
the content-reward, neutral-feedback, and 


TABLE 3 
SUMMARY OF INDIVIDUAL COMPARISONS FOR SUBJECTS 
HOLDING INITIAL ATTITUDES IN FAVOR OF 
CAPITAL PUNISHMENT 

Comparison                  Dm        t 
Role-control               15.77     5.86** 
Content-control             8.15     3.03* 
Neutral-control            [illegible] 
Group opinion-control      [illegible] 
Role-content                7.62     2.83* 
Role-neutral                7.46     2.66* 
Role-group opinion         10.62     3.95 
Content-neutral             −.54 
Content-group opinion       3.35 
Neutral-group opinion       3.00 
=b b 
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group-opinion means are significant. While 
the neutral-feedback condition did not pro- 
duce significantly greater changes in at- 
titudes than a control for the effects of social 
influence, neither did the content-reward 
group which presumably represented the joint 
operation of dissonance reduction and social 
influence. However, the effects of social in- 
fluence alone did not produce significant 
changes in attitudes in comparison with the 
control group. 

It would appear, then, that social influence 
in and of itself did not produce significant 
changes in attitudes in this research. The joint 
operation of dissonance reduction and social 
influence in the content-reward condition did 
not produce changes greater than those ex- 
pected by dissonance reduction alone. On the 
other hand, the results for the role-reward 
condition clearly indicate greater changes 
than would be expected from the operation of 
either social influence or dissonance reduction 
separately or jointly. 


DISCUSSION 


As is the case in most complex experiments, 
a variety of theoretical interpretations can be 
brought to bear upon the obtained data. How- 
ever, since this experiment was suggested by 
Festinger's (1957) theory of cognitive dis- 
sonance, it would appear appropriate to dis- 
cuss the findings in light of dissonance theory 
first. Alternative plausible explanations will 
be discussed together with possible limita- 
tions in the present experimental design. 
While it should be noted that the present 
research was suggested by dissonance theory, 
it can hardly be said to have been derived 
systematically from dissonance theory. In the 
following discussion, the results for those 
initially holding attitudes in favor of capital 
punishment are considered. Apparently, sub- 
jects who were initially opposed to capital 
punishment held such attitudes so strongly 
that all attempts to influence them failed. 

Dissonance is thought to exist when two 
cognitions held by the individual are in- 
compatible. In the present experiment, the 
cognition held about one's belief concerning 
capital punishment was dissonant with the 
cognition held about one's public behavior. 
It was important to note that dissonance is 
not generated between a cognition and be- 
havior but between a cognition concerning 
beliefs and a cognition concerning behavior. 
Another way of stating this is simply that 
dissonance is created when a belief is in- 
compatible with an interpretation of one’s 
behavior. The interpretation one places upon 
his behavior in a situation designed to produce 
dissonance may either reduce or increase the 
amount of dissonance generated and as a con- 
sequence, the amount of dissonance reduction 
which takes place. Now, it would appear 
logical to assume that the interpretation one 
places upon his behavior is, in part, a func- 
tion of the impact of that behavior upon 
others. Reward in the form of positive feed- 
back from others about some aspect of per- 
formance constitutes information useful to the 
individual in constructing interpretations of 
his behavior. Such interpretations placed upon 
one’s behavior may serve to increase or re- 
duce dissonance. In the present experiment, 
it is assumed that providing subjects with 
feedback that their role-playing ability proved 
superior operated in some fashion or another 
to increase the total amount of dissonance 
generated. Telling subjects that the manner 
in which they complied was superior resulted 
in greater attitude change than telling them 
their actual intellectual ideas were superior 
or providing them with neutral feedback on 
both variables. Why is this so? One possibility 
is as follows: 

If the way in which the subject construes 
his behavior is, in part, a function of the im- 
pact of his behavior upon others, what im- 
plications might a role-reward subject draw 
from the feedback from the audience? Inform- 
ing the subject that the manner in which he 
complied was perceived by audience members 
as superior is equivalent to informing him 
that others perceived his actual behavior as 
exemplary role behavior for a person who 
actually possesses the subject’s publicly stated 
opinion. In other words, the impact of the 
subject’s behavior upon others is such as to 
suggest sincere commitment on the part of 
the subject to the publicly stated opinion. 
Hence, strong dissonance is produced with 
subsequent marked dissonance reduction in 
the form of attitude change. 

A considerably more internal view of this 
process is suggested by the following remarks 
obtained from a role-reward subject in an 
interview following the experiment: 


I tend to think of myself as an honest and sincere 
person. When you told me that the others con- 
sidered me “a very good actor,” I was somewhat 
baffled . . . actually, a little offended. The more I 
thought about it, the more I became convinced that 
what I had said in the debate was what I truly 
believed. ° 


This subject changed 25 points away from 
his original position, completely reversing his 
position on the issue of capital punishment. 
This example raises the intriguing possibility 
that maximal dissonance may be generated by 
pitting interpretations about one’s behavior 
against relatively stable personal constructs 
(Kelly, 1955). In this case, it would appear 
that informing the subject that his role-play- 
ing ability was superior led him to interpret 
his behavior in the setting as evidence of 
“insincerity” on his part. A construction of 
himself as an “insincere” person was markedly 
dissonant with a rather strong personal con- 
struct, "sincere" person. In order to maintain 
a consistent view of himself along a rather 
important dimension, he apparently had no 
other choice than to change his opinion about 
capital punishment. 

Rewarding the actual intellectual content 
of the individual's participation in this set- 
ting produced no attitude change over and 
above that expected by either dissonance 
reduction alone or social influence. Why is 
this so? One very real possibility is simply 
that intellectual ideas are “transpersonal.” 
Subjects have many sources from which to 
draw such ideas. Discussions of the pros and 
cons of capital punishment are available in 
the mass media. While the subject was forced 
to “think up” his own arguments, it seems 
highly likely that he simply verbalized the 
"hack" arguments for or against capital 
punishment. Numerous subjects indicated in 
interviews after the experiment that this in- 
deed was the case. Typical remarks were as 
follows: “I just talked about some of the 
things I read in a newspaper article recently,” 
“I remembered some of the arguments that 
were raised in a debate I was in in high school,” 
etc. In essence, rewarding the actual intel- 
lectual content in this experiment failed to 
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increase dissonance because such ideas are, 
in effect, “public domain” and not the ex- 
clusive personal productions of the subject. 
Hence, the subject could verbalize such ideas, 
receive reward for the reasonableness, logical 
nature of, and persuasiveness of such argu- 
ments but experience little dissonance over 
and above that expected from simple volun- 
tary compliance. 

In discussing alternative explanations for 
the obtained findings, the “mental rehearsal” 
hypothesis introduced by Janis and King 
(1954) immediately comes to mind. This 
notion holds that attitude change in forced- 
compliance settings results from increased 
availability and saliency of opposing argu- 
ments, that is, the individual is "impressed 
by his own cogent arguments, clarifying il- 
lustrations, and convincing appeals which he 
is stimulated to think up in order to do a 
good job of “selling” the idea to others 
[Janis & King, 1954, p. 218]." While this 
factor of mental rehearsal and increased 
availability of ideas may indeed be operative 
in this research, it is difficult to explain why 
it should be more operative in one of three 
feedback conditions and less operative in the 
other two. 

It might be argued that it is not increased 
availability per se which produces change 
but the fact that the ideas are actively self- 
produced and elaborated. This seems to be 
closer to the position of Janis and King 
(1954). Again, the obtained differences among 
the role, content, and forced-compliance con- 
ditions render this argument implausible. 
One would have to assume that subjects in 
the role-reward condition were more actively 
involved in thinking up ideas than subjects 
in the other two conditions. This argument 
appears highly implausible in that subjects 
in all three groups were required to generate 
their arguments. On the other hand, the 
present results can be considered in partial 
agreement with Janis and King’s second 
hypothesis, concerning “satisfaction with 
one’s own performance [p. 216].” It is pos- 
sible that awareness of success in role playing 
increases the subject’s involvement in the 
role. And the greater the involvement of the 
subject, the more one would expect him to 
strive for self-consistency. 
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The results of this research suggest that 
dissonance may be reduced or intensified by 
the particular interpretations placed upon his 
behavior by the subject in a setting in which 
his public behavior is discrepant with his 
private opinion. Reward may be construed as 
differential information about the impact of 
the subject's behavior upon others and, as 
such, it is useful to the subject in construct- 
ing interpretations of his behavior. These 
interpretations may be discrepant with more 
stable and enduring personal constructs held 
by the subject. Hence, the total amount of 
dissonance generated may be increased. 
Experiments in which the subject is led to 
reconstrue his behavior in a forced or volun- 
tary compliance setting seem a logical first 
step in developing experimental verification 
or disconfirmation for the above admittedly 
speculative thought. 
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WORD-ASSOCIATION RESPONSES: 


COMPARISONS OF AMERICAN AND FRENCH MONOLINGUALS 
WITH CANADIAN MONOLINGUALS AND BILINGUALS * 
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Word-association norms were developed for student groups of English- and 
French-Canadian monolinguals, and Canadian English-French bilinguals re- 
sponding once in English and once in French. These were compared with pub- 
lished norms for American and European French students. It was found that: 
(a) The French-Canadians resemble the French-French in their associational 
response diversity and the English-Canadians resemble the more stereotyped 
response pattern of the Americans. Depending on the language of response, the 
bilinguals respond either in the American or French fashion. (b) There was 
a marked degree of response content equivalence between the American and 
English-Canadian samples and relatively little between the French-Canadian 
and the other monolingual groups. The bilinguals had intermediary degrees 
of response similarity suggesting that they could be effective as linguistic 
mediators in the various communication systems. (c) The French-French Ss 
use superordination less than all the other groups. The findings are discussed in 
terms of theories of cultural differences, bilingualism, and meaning. 


As more word-association norms from dif- 
ferent parts of the world become available, it 
is possible to make comparisons, both within 
and between language communities, that lead 
to intriguing questions about changes through 
time and cultural differences in associational 
responses. For example, a standard list of 
stimulus words (the Kent-Rosanoff list) has 
been used with samples of English-speaking 
Americans since 1910. Jenkins and Russell 
(1960) noted that Americans increased the 
use of common or popular responses between 
1910 and 1952 and changed in certain respects 
the content of their associational responses, 
although differentially since the less popular 
ones were those that changed most. Further- 
more, during the time period there was a de- 
crease in the use of superordinate responses 
(e.g., table-furniture). Jenkins and Russell 
argued that both the increase in the use of 
popular responses and the decrease in super- 
ordination over 40 years were likely due to an 
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increased sophistication with short-answer- 
type questions. They chose this over an al- 
ternative interpretation: that American lan- 
guage habits may have become more stand- 
ardized because of the growth of mass com- 
munications. 

In 1918, Esper (1918) translated a list of 
German stimulus words into English and 
found many similarities between the responses 
of German and English subjects with regard 
to reaction time and types of responses given. 
More recently, Rosenzweig (1957) compared 
the associational responses of groups of Ameri- 
can and French students and noted that the 
French group gave more diversified responses 
than the Americans did and that the two 
groups had equivalent primaries (most popu- 
lar responses) in only about 50% of the 
possible cases. It is of interest that the French 
students gave many fewer superordinate re- 
sponses than did the Americans, indicating 
that the frequency of use of popular responses 
is not systematically related to superordina- 
tion, as one might have inferred from changes 
over time mentioned in the Jenkins-Russell 
study. Rosenzweig attributed the French tend- 
ency toward associational diversity to the 
greater stress placed on individuality in 
French education. He did not speculate about 
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the comparatively infrequent use of super- 
ordinates among the French. 

Later, Rosenzweig (1961) examined the 
primary responses of comparable groups of 
French, Italian, German, and American sub- 
jects and found that even in this case the 
American group was distinctive in its marked 
tendency to use common responses. All four 
groups gave the same primary responses to 
only 21 out of the 89 possible cases, and the 
American, French, and German primaries 
agreed in only 36 out of 100 cases. Variations 
in the degree of overlap of primary responses 
have also been noted among subgroups from 
the same linguistic community by Rosenzweig 
(1964). He compared French students and 
workers with comparable groups of Americans 
and found that the worker-student response 
differences were greater for the French sub- 
samples, suggesting that education and social- 
class differences may affect modes of associat- 
ing or types of associations given in certain 
cultures more so than in others. 

In the present study, comparisons of asso- 
ciational responses are extended to include 
three distinctive social groups living in one 
geographical community, two of them mono- 
lingual and one bilingual. The setting is the 
Province of Quebec and the groups are Eng- 
lish-Canadian and French-Canadian mono- 
linguals and Canadian English-French bilin- 
guals. The purpose of the investigation is to 
compare the associational responses of English- 
Canadians and Americans, making it possi- 
ble to examine critically the belief some Eng- 
lish-Canadians hold that they have a distinc- 
tive language and culture; to compare the 
responses of French-Canadians, Americans, 
English-Canadians, and French students from 
France in order to evaluate the conflicting 
claims that Canadian French is largely Ameri- 
canized, that French-Canadians are a linguisti- 
cally and culturally isolated group, or that 
they are North American representatives of 
European French language and culture; and 
to examine the responses of English-French 
bilinguals from Quebec who may or may not 
be linguistically dependent on the English and 
French monolingual groups and who may or 
may not be effective when acting as communi- 
cational liaisons between them. 

Attention will be given to three features of 
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the various groups’ responses: their distribu- 
tion, that is, the degree of group stereotypy or 
dispersion of responses; their content, that is, 
the degrees of equivalence of specific responses 
to the same or translated equivalent stimulus, 
including analyses of all responses given as 
well as the primaries; and one aspect of the 
form of the responses, namely, variations in 
the use of superordinates. 


METHOD 


The Kent-Rosanoff stimulus-word list was trans- 
lated for French-Canadians by a panel of Canadian 
French-English bilinguals, working from both the 
English and French versions. The final form was 
different for 8 of the 100 stimuli from Rosenzweig’s 
(1957) version used for French students. These 
changes make the stimuli, even when translated, as 
equivalent as possible for all groups. Comparisons 
are based on the 100 stimuli except when otherwise 
noted. According to the bilingual translators, the 
stimuli appeared to be as appropriate, that is, as 
common and unemotional in meaning, in their 
French as in their English versions. 

The subjects were 136 male French-Canadian 
monolingual students from a collège (the equivalent 
of a French lycée or an American junior college) in 
Quebec City; 206 male and female English-Canadian 
monolingual advanced students in Montreal high 
schools; and 88 English-French bilinguals, males and 
females, from one high school and two collèges in 
Montreal. Approximately half the bilinguals received 
the English version first and the Canadian French 
version about 3 weeks later; the other half started 
with the French version. Responses of males and 
females were tabulated together. 

The criteria for bilinguality included a self-evalua- 
tion indicating a fairly good or good degree of skill 
in speaking, reading, writing, and understanding the 
second language, and completion of at least 90 of 
the 100 responses in both languages. Monolinguals 
were those who had very little or no skill in the 
second language as reflected in the same measures. 
Response frequencies were computed with a varying 
number of subjects due to incomplete or illegible 
answers. 

The list of 100 words was given as a group test, 
and subjects were asked to write down immediately 
the first word that came to mind as they read each 
stimulus word. They were also instructed to work 
rapidly through the list. 


RESULTS 
Associational Response Distributions 


In Table 1 the four Canadian groups are 
compared with the American and French- 
French student norms in terms of the mean 
percentages (over the 100 stimulus words) of 
subjects who contribute to the primary, sec- 
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3 TABLE 1 i 
RESPONSE STEREOTYPY OF VARIOUS GROUPS OF STUDENTS, IN MEAN PERCENTAGES 

                                  Americanᵃ   English-    Bilinguals   French-     Bilinguals   French- 
                                              Canadian    in English   Canadian    in French    French 
                                              1964        1964         1964        1964         1955ᵇ 

Primaries                         37.5        33.3        25.9         23.7        23.0         20.4 
Secondaries                       13.6        11.5        11.7         11.2        10.3          9.8 
Tertiaries                         8.0         7.4         7.2          7.5         6.9          6.9 
Total                             59.1        52.2        44.8         42.4        40.2         37.1 

Median percentage of primaries    34          [illegible] 23           20          19           18 
Rank-frequency slopes             −1.4        −1.3        −1.1         −1.0        −1.1         −1.0 

ᵃ From Russell and Jenkins (1954). 
ᵇ From Rosenzweig (1959). 


ondary, tertiary responses, and the average of 
these three most popular responses combined.* 

It is apparent that the American group 
shows the most stereotypy in responding and 
the French-French group least, with the Eng- 
lish-Canadians being notably like the Ameri- 
cans and the other three groups placing pro- 
gressively closer to the French-French. The 
bilingual group presents different distributions 
when using the two languages, behaving like 
the English-Canadians when responding in 
English and more like the French-Canadians 
and French-French when responding in 
French. The same pattern is reflected in the 
“median percentage” of primaries presented 
in Table 1. 

Rosenzweig (1959) used a “rank-frequency 
function” to express differences in the tend- 
ency to diversify associational responses in 
another fashion. For this measure, the mean 
frequency of all primaries becomes the fre- 
quency for Rank 1, the mean frequency of all 
secondaries becomes the frequency for Rank 
2, and so on, and these values are plotted on 
the abscissa of log-log coordinates with the 
frequencies of response distributed on the 
ordinate. These curves are approximately 
linear for sizable samples; the steeper the 
slope, the more stereotyped the response pat- 
tern is. Rosenzweig found that the American 
slope was −1.4 and the French-French slope 


3 Tables giving the primary responses and per- 
centage of students for each of the Canadian groups 
have been deposited with the American Documenta- 
tion Institute. Order Document No. 8734 from ADI 
Auxiliary Publications Project, Photoduplication 
Service, Library of Congress, Washington, D. C. 
20540. Remit in advance $1.25 for microfilm or 
$1.25 for photocopies and make checks payable to: 
Chief, Photoduplication Service, Library of Congress. 
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—1.0, It will be noticed in Table 1 that the 
slope for the English-Canadian responses ap- 
proximates closely that of the Americans while 
those of the other three groups are essentially 
alike and similar to that of the French-French. 
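
The rank-frequency slope can be illustrated with a short computation: take the mean frequencies at Ranks 1, 2, 3, and so on, and fit a straight line to log frequency against log rank. The sketch below is a minimal version that uses only the three ranks reported in Table 1 (primaries, secondaries, tertiaries). That restriction is an assumption made for illustration; the published slopes were presumably based on more ranks, so the values obtained here are approximations only.

```python
import math


def rank_frequency_slope(mean_frequencies):
    """Least-squares slope of log(frequency) on log(rank); ranks start at 1."""
    xs = [math.log(rank) for rank in range(1, len(mean_frequencies) + 1)]
    ys = [math.log(freq) for freq in mean_frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Mean percentages for primaries, secondaries, and tertiaries (Table 1).
AMERICAN = [37.5, 13.6, 8.0]
FRENCH_FRENCH = [20.4, 9.8, 6.9]

if __name__ == "__main__":
    print(f"American slope:      {rank_frequency_slope(AMERICAN):.2f}")
    print(f"French-French slope: {rank_frequency_slope(FRENCH_FRENCH):.2f}")
```

A steeper (more negative) slope indicates that responses are concentrated on a few popular words, which is the sense in which the American pattern is the more stereotyped one.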


Equivalence of Associational Responses 


For 20 of the stimulus words, all six of the 
norm groups produce the same (or translated 
equivalent) response as a primary. In 15 of 
these instances, the primaries are above the 
median percentage of all primaries for the 
group, showing a tendency for responses with 
a high probability of occurring to make up a 
core, albeit small, of common associational 
responses for these language groups. The 
stimuli (in small capitals) and responses in 
question are: TABLE-chair, MAN-woman, 
BLACK-white, FRUIT-apple, CHAIR-table, 
WOMAN-man, SPIDER-web, NEEDLE-thread, 
GIRL-boy, EAGLE-bird, STEM-flower, LAMP- 
light, DREAM-sleep, BOY-girl, BLUE-sky, HEAD- 
hair, LONG-short, SQUARE-round, SCISSORS-cut, 
and KING-queen. 

Table 2 shows the degree of equivalence of 
primary responses between pairs of language 
groups. The equivalence for the English-Ca- 
nadian and American student groups is strik- 
ing (.78) especially in view of Rosenzweig’s 
finding that two halves of the same French- 
French student group agreed in 75% of the 
cases only. In contrast, the English-Canadian 
group shows only 51% agreement with the 
French-Canadian which in turn has only 44% 
agreement with the French-French. The rela- 
tionship of the bilinguals to the two Canadian 
monolingual groups is also relatively strong 
and generally much stronger than the relation 
of the two monolingual groups with one an- 
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TABLE 2 M. 
GROUP COMPARISONS OF EQUIVALENT PRIMARY RESPONSES 

                         English-    American      French-    Bilinguals    Bilinguals 
                         Canadian                  French     in English    in French 
French-Canadian            .51       [illegible]     .44         .57           .68 
English-Canadian                        .78          .46         .61           .56 
American                                             .46         .52        [illegible] 
French-French                                                    .33           .45 
Bilinguals in English                                                          .59 
Bilinguals in French 

Note. In proportions, since translated equivalents of stimuli in certain comparisons were not equivalent in others. This 
variance affected three or four stimuli in various cases. 


other. It is also evident that the content of 
the bilinguals’ associations changes when they 
use their other language for responding, the 
closeness of overlap shifting from the English- 
Canadian to the French-Canadian groups as 
they change from English to French. 
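
The proportions of equivalent primaries can be thought of as a simple count: for every stimulus available in both groups, check whether the most popular response is the same (or the translated equivalent), and divide by the number of stimuli compared. The sketch below illustrates this. The two small dictionaries are hypothetical, not the study's norms, and responses are assumed to have been mapped to a common gloss beforehand.

```python
def primary_agreement(primaries_a, primaries_b):
    """Proportion of shared stimuli on which two groups give the same primary."""
    shared = primaries_a.keys() & primaries_b.keys()
    if not shared:
        raise ValueError("the two groups have no stimuli in common")
    same = sum(primaries_a[stim] == primaries_b[stim] for stim in shared)
    return same / len(shared)


# Hypothetical primaries (stimulus -> most popular response), for illustration.
group_a = {"TABLE": "chair", "MAN": "woman", "BLACK": "white", "FRUIT": "apple"}
group_b = {"TABLE": "chair", "MAN": "woman", "BLACK": "night", "FRUIT": "apple"}

if __name__ == "__main__":
    print(f"Agreement: {primary_agreement(group_a, group_b):.2f}")
```

Restricting the count to stimuli present in both groups mirrors the point made in the table note that a few stimuli were not equivalent in every comparison.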

Our next step was to determine if these 
same trends are apparent when consideration 
is given to all the responses obtained, not 
only the primaries. Table 3 presents “group 
overlap coefficients” of all possible compari- 
sons with the exception of the French-French 
group because these norms have not been 
published. The overlap coefficient was devel- 
oped by Rosenzweig as a means of using all of 
the responses, not just the primaries, in esti- 
mating the similarities and differences of any 
pair of subgroups in their modes of associat- 
ing to the same set of stimulus words. The 
coefficient (given in detail by Rosenzweig, 
1964, pp. 60 ff.) determines for any stimulus 


TABLE 3 
GROUP OVERLAP COEFFICIENTS AND SIGNIFICANCE 
TESTS FOR VARIOUS GROUP COMPARISONS 

Groups compared 
 1. English Canadian-American                 .69 
 2. French Canadian-Bilinguals, French        .54 
 3. American-Bilinguals, English              [illegible] 
 4. English Canadian-Bilinguals, English      .45 
 5. English Canadian-French Canadian          .43 
 6. French Canadian-American                  .42 
 7. American-Bilinguals, French               .40 
 8. English Canadian-Bilinguals, French       [illegible] 
 9. Bilinguals, English-Bilinguals, French    .39 
10. French Canadian-Bilinguals, English       .36 

Note. Using the Duncan range test values, any coefficients 
that differ by .09 and .11 are significant at the .05 and .01 levels, 
respectively. Thus, the English Canadian-American overlap 
(1) is significantly greater than any other (2-10) whereas there 
are no significant differences in degrees of overlap among 
Comparisons 5-10. 


how much overlap there is, considering all the 
responses given by any two groups. For ex- 
ample, if .75 of one group gave the response 
chair to TABLE and .50 of the comparison 
group gave the same response to TABLE, then 
.50 is taken as the “common fraction” of 
overlap. This procedure is followed for all the 
different responses given by both groups to 
any particular stimulus word and the sum of 
the common fractions is the overlap coeffi- 
cient. In the present case, coefficients were 
determined for the first and every fifth word 
of the 100 stimuli for each group. The mean 
coefficients are given at the left in Table 3. 
The reliability of the differences among means 
was tested by analysis of variance and it was 
apparent that both the differences among com- 
parison group pairs (F = 28.10, df = 9/180) 
and among stimulus words (F = 13.70, df= 
20/180) were significant beyond the .01 level. 
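
The overlap coefficient described above amounts to summing, over every response given by both groups to a stimulus, the smaller of the two proportions. A minimal sketch follows. The response distributions are hypothetical apart from the chair example from the text (.75 and .50), and the function names are illustrative rather than Rosenzweig's.

```python
def overlap_coefficient(dist_a, dist_b):
    """Sum of the 'common fractions' for responses given by both groups."""
    return sum(
        min(prop, dist_b[response])
        for response, prop in dist_a.items()
        if response in dist_b
    )


def mean_overlap(norms_a, norms_b, stimuli):
    """Mean overlap coefficient over a chosen sample of stimulus words."""
    return sum(overlap_coefficient(norms_a[s], norms_b[s]) for s in stimuli) / len(stimuli)


# Hypothetical proportions of each group giving each response to TABLE.
table_a = {"chair": 0.75, "food": 0.10, "desk": 0.05}
table_b = {"chair": 0.50, "food": 0.20, "wood": 0.10}

if __name__ == "__main__":
    # chair contributes .50 and food contributes .10; desk and wood are unshared.
    print(f"Overlap for TABLE: {overlap_coefficient(table_a, table_b):.2f}")
```

Because each response contributes only the smaller proportion, the coefficient reaches 1.0 only when two groups distribute their responses identically, and it falls toward 0 as they share fewer responses.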

The Duncan range test, a multiple group 
comparison statistic, was used to isolate the 
group comparisons which contribute to the 
overall significance. Reading from Table 3, it 
is evident that the Americans are more like 
the English-Canadians and the bilinguals in 
English than they are like the French-Canadi- 
ans or the bilinguals in French. The English- 
Canadians, in turn, are more like the Ameri- 
cans than they are like any of the other three 
Canadian groups. The French-Canadians are 
more like the bilinguals in French than they 
are to any of the other three groups. In fact, 
they are equally distant in response overlap 
from the Americans and the English-Canadi- 
ans. The bilinguals in English are equally dis- 
tant from English- and French-Canadians. 
They are even more like Americans than 
French-Canadians and slightly more like 
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gesting that they may model on American 
speakers of English as much or more than on 
.English-Canadians. On the other hand, the 
bilinguals in French are more similar to the 
French-Canadians than to the English-Ca- 
nadians, the Americans, or the bilinguals in 
English, This pattern of findings is essentially 
similar to that produced by the analysis of 
primaries only, except that no statements can 
be made about the relations with the French- 
French. In general both analyses indicate 
that: (a) The major difference in the content 
of associational responses uncovered in these 
comparisons is that between the American and 
the French monolingual groups. The similar- 
ity of the English-Canadian monolinguals and 
American responses suggests that the English- 
Canadian may model on American speech 
habits to a marked degree. (b) The English- 
and French-Canadian monolinguals are rela- 
tively dissimilar in their response content. 
(c) The bilinguals have quite distinctive as- 
sociations in their two languages, especially 
noticeable when all responses are considered. 
However, because of these language discrep- 
ancies, the bilinguals become very much like 
the two Canadian monolingual groups as they 
switch from one language to the other indi- 
cating their presumed dependence on the 
monolingual groups for their skill in both 
languages and their potential role as linguistic 
mediators for the two monolingual groups. 
This point will be taken up in the discussion 
to follow. 


Superordinate Responses 


Jenkins and Russell (1960) devised an ob- 
jective means of determining which responses 
given to the Kent-Rosanoff stimuli were super- 
ordinates and found that 39 of the 100 stim- 
uli could call out unambiguous superordi- 
nates. For the French version of the list, two 
of these had to be eliminated (BLOSSOM and 
MUTTON) for the reasons given by Rosen- 
zweig (1964, p. 65); the stimulus ANGER was 
also eliminated because our bilingual judges 
felt that its superordinate (“emotion”) was 
somewhat ambiguous in both French and 
English. Thus, the proportions given below 
are based on 36 words for the French-Ca- 
nadians and the bilinguals in French and on 
38 words for the English-Canadians and the 
bilinguals in English. 

The superordinates accounted for the fol- 
lowing percentage of responses: for French- 
Canadians, 8.58; for English-Canadians, 8.12; 
for bilinguals in English, 7.20; and for bi- 
linguals in French, 6.45. All four of these are 
substantially like the 8% found for the Amer- 
icans and they all contrast with the 3.3% for 
the French-French students. 


Discussion 
Associational Response Diversity 


It was found that the group of American 
students was the most stereotyped in their 
associational responses and the French-French 
students least so, with the English-Canadians 
very similar to the Americans and the French- 
Canadians very similar to the French-French. 
The bilinguals placed between the English- 
and French-Canadian monolingual groups 
when responding in English and between the 
French-Canadians and French-French when 
using French. These findings not only support 
those of Rosenzweig (1961) on European and 
American group differences, but they also ex- 
tend the contrasts with Americans to include 
French-Canadians (i.e., North Americans) 
who show as much response diversity as the 
European groups do. 

One can only speculate about the reasons 
for these differences. There is likely not much 
difference between the English- and French- 
Canadian monolingual groups with respect to 
their selection for schooling, one possibility 
given by Rosenzweig to account for French- 
French and American student differences. Al- 
though the French-Canadian students were at 
lycées, they would not likely be more care- 
fully selected than the English-Canadian high- 
school students, most of whom were from 
middle or upper-middle class districts and 
preparing for university study. It is more 
likely that differences in educational experi- 
ences play an important role. Compared to 
the Americans and English-Canadians, the 
French-Canadians, like the French-French, 
follow a so-called classical program of instruc- 
tion where much less use is made of short- 
answer-type questioning than of composition 
writing which stresses the development of 
ideas. There may be, however, a more gen- 
eral cultural contrast reflected here between 
French people, both in France and Canada, 
who may place more value on verbal brilliance 
and linguistic wittiness than Americans who 
in turn may value more the ability to talk 
the other fellow's language. The present find- 
ings are consonant with an earlier study by 
Lambert (1956) who noted that English- 
French bilinguals from France were less 
stereotyped than American monolinguals in 
both their French and English associations, 
suggesting that some cultural factor affects 
associational stereotypy. However, various 
lines of research are clearly needed to actually 
account for these cultural differences, and to 
determine whether these differences hold for 
other than student groups in the countries 
being compared. 

The modification of response stereotypy 
noted as bilinguals change their language of 
response is of interest, first because it indi- 
cates that, in becoming bilingual, they are 
able to incorporate such relatively subtle fea- 
tures of the two languages, and second be- 
cause it suggests that associational response 
diversity may be one of the features of the 
languages learned that assist bilinguals in 
keeping their two languages functionally sepa- 
rated, a matter of theoretical concern (see 
Lambert, 1963; Lambert, Havelka, & Crosby, 
1958). It is also of interest that not all bi- 
linguals are able to modify their responses to 
accord with those typical of the two languages 
they have learned. The French-French bi- 
linguals referred to above (Lambert, 1956) 
were as diversified in their English as in their 
French responses. In contrast to the bilinguals 
examined here, they were adult French na- 
tives, many of them teachers of French in the 
United States, who had learned English at 
school and in American communities later in 
life. In other words, their English skill, al- 
though brought up to a high level, was appar- 
ently a recent overlay on a basic French lan- 
guage system. 


Equivalence of Associational Content 


When comparing the equivalence of re- 
sponse content from group to group, we found 
some pairs agreeing on primaries in 78% of 
the possible cases and others, only 33%. In 
Rosenzweig’s (1961) study, he found agree- 
ment of primaries between American and 
European groups ranging from 35 to 48%. He 
felt these percentages of equivalence were 
artificially low because: he was dealing only 
with primaries, thereby missing equivalences 
in the rest of the responses given; because the 
translations of stimulus words were often not 
equivalents; and because the methods of col- 
lecting the responses were not standardized. 
Rosenzweig, in fact, believed that the agree- 
ments were strong enough to indicate a “cross- 
cultural community which shares verbal asso- 
ciations and meanings [p. 357]." In the pres- 
ent case, we can have somewhat more confi- 
dence in the percentages of equivalence not 
only because the range is greater but also be- 
cause all responses (except for comparisons 
with the French-French group) as well as 
primaries can be compared, the translations 
have been more exactly matched, and the 
procedures used for collecting the responses 
were standardized. 

Because we are as interested in group dis- 
similarities of associational content as in the 
similarities, it is necessary to explain briefly 
how we interpret associational responses. We 
view them as particular connotative meaning 
networks, parts of which may be activated 
whenever their appropriate stimulus words 
are either decoded or are about to be encoded. 
The networks not only convey emotional con- 
notations but also direct the train of thought 
as particular stimulus words are encountered. 
This view is similar to that of Carroll (1964) 
who interprets word-association responses as 
“some part of an assemblage of mediating 
processes [p. 100]” reflecting “the variety of 
experiences represented in a concept [p. 
101].” The present view is also basically re- 
lated to Deese’s (1962) notion of “associative 
meaning.” Deese noted that two words often 
do not elicit one another as associational re- 
sponses yet they do have a great many asso- 
ciations in common. For example, samples of 
undergraduates gave the associations note, 
song, sound, noise, music, and orchestra to 
both PIANO and SYMPHONY. Such words, he 
argues, have much associative meaning in 
common and are thus linked within some 
general concept. In the present case, the same 
or translated equivalent stimulus word may 
have the same or different associational mean- 
ings for different groups and when the asso- 
ciational systems are discordant, the social 
communication may be disrupted. For exam- 
ple, the English-Canadian and bilinguals in 
English gave the primary response God to 
the stimulus word BIBLE whereas the French- 
Canadians and the bilinguals in French gave 
livre (book) as their primary responses. In 
communication, the two monolingual groups 
could easily miss the full significance of one 
another's messages because such associational 
discordances color the meaning and shunt the 
line of associations off on quite different 
routes. In this example, the bilinguals would 
likely transmit the discrepancy with fidelity 
from one monolingual group to another, 
switching from one associational network to 
another as they change languages. The dis- 
cordance would accumulate when sequences 
of ideas, as in sentences, are socially trans- 
mitted. For instance, one might want to relate 
the concepts CHILD, SICKNESS, and DOCTOR. 
The primary associates to these words are: 
mother, health, and nurse for our English- 
Canadian subjects and baby, hospital, sick- 
ness for the French-Canadians. If bilinguals 
are used to transmit this message, additional 
distortion is likely since the primaries are 
mother, bed, sick for the bilinguals in English 
and baby, bed, sickness for the bilinguals in 
French. 

Assuming then that these group differences 
in associational correspondence determine in 
part the difficulty or ease of communication 
both across and within linguistic groups, we 
can profit from a reexamination of Table 2. 
In view of the long-term tensions between 
French- and English-Canadians in Quebec, 
attention is first drawn to the relatively low 
correspondence between their associational 
networks (.51), much lower than that between 
English-Canadians and Americans (.78). The 
communication of connotative significance 
between members of these groups might be 
improved through the bilinguals who have 
somewhat more associational similarity with 
themselves in their other language (.59) and 
who make better contact with each of the mono- 
lingual groups, especially when using the same 
language as the monolinguals (bilinguals in 
English and English-Canadians = .61, bi- 
linguals in French and French-Canadians = 
.68) but also when using the other language 
(bilinguals in English and French-Canadians 
= .57, bilinguals in French and English-Ca- 
nadians = .56). The bilinguals are potential 
sources of rapprochement between these two 
relatively discordant monolingual groups and 
they might be used as linguistic mediators to 
help prepare, transmit, and receive messages 
from one group to the other. 

The discordance in the networks of the 
English- and French-Canadian monolingual 
groups might also be improved on if the 
French-Canadians were to move toward the 
American pattern (equivalence of primaries 
for the French-Canadian and Americans is .45 
compared to the .78 for the English-Canadian 
and American groups). Since the associational 
pattern of the French-Canadian group is 
relatively isolated from all of the others, 
French-Canadians may realize their difficulty 
in expressing the full meaning of their ideas 
and thereby sense a certain pressure to adjust 
to either the English-Canadian and American 
pattern, or at least the French-French pattern. 

Finally, the English-Canadian associational 
contact with the French-French (.46) could 
be made worse by the use of bilinguals in Eng- 
lish who have relatively low equivalence of 
primaries with the French-French (.33). The 
distortions would come mainly from transmit- 
ting messages from the English-Canadian 
monolinguals to the bilinguals in English, but 
even then the relay of the message through 
the bilinguals’ French would not likely be 
good (bilinguals in French and French-French 
= .45). Possibly English-French bilinguals 
from France might be better mediators in this 
case. 

These notions need the support of further 
normative studies with more careful selection 
of bilinguals who have demonstrable skill of 
equal power in both languages. If these norms 
are reliable, then laboratory investigations of 
bilinguals transmitting and receiving mes- 
sages in each of their languages could be at- 
tempted as a means of studying ease and dif- 
ficulty of communication. 


Superordinate Responses 


The fact that the French-French students, 
but not the French-Canadians, gave so few 
superordinates eliminates the possibility that 
the French language limits superordination in 
some way—an inference one might draw from 
Rosenzweig's (1964) study. The analysis of 
superordination also makes it clear that the 
French-Canadian students are similar to the 
French-French only in their relative diversity 
of responses, but are as different from the 
French-French as the Americans are with 
regard both to the content of their responses 
and their use of superordinates. The English- 
Canadian norms, however, are very similar to 
those of the Americans on all three counts. 
9 combat engineering squads competed in their training and garrison duties to 
test the hypothesis that intergroup competition promotes close interpersonal 
relations among group members and improves morale and adjustment. 18 squads 
for whom no changes in training were introduced served as controls. Question- 
naire measures of interpersonal relations and adjustment were obtained before 
and after a 3-mo. experimental period. Changes in self-perceptions and re- 
actions to military life showed a relative improvement in adjustment of the 
members of competitive squads as compared with members of control squads. 
Men trained under competitive conditions also had a lowered level of manifest 
anxiety on the Taylor scale. Improvement in the quality of interpersonal rela- 
tions was indicated by a significantly greater change in within-squad socio- 
metric choices of combat leaders and work partners for the members of 
competitive squads. However, these improvements did not generalize to nontask 
aspects of relations among squad members. 


Can military and industrial work groups be 
engineered to be psychologically adjustive for 
their members? This question is of obvious im- 
portance from a scientific point of view as 
well as for the management of organizations 
which must operate under anxiety-arousing 
and stressful conditions. A research program 
directed by Fiedler and his associates (Alex- 
ander & Drucker, 1960; Fiedler, 1962; Fied- 
ler, Hutchins, & Dodge, 1959; McGrath, 
1962; Myers, 1962b) has attempted to iden- 
tify quasi-therapeutic interpersonal relations 
and situational determinants which promote 
or facilitate the adjustment of the individual 
group member. The underlying hypothesis of 
the research program has been that close 
positive interpersonal relations between the 
individual and his fellow group members con- 
stitute one important determinant in the 
process which leads to psychological adjust- 
ment. Any interpersonal and situational fac- 
tors which promote such close positive inter- 
personal relations should therefore prevent or 
alleviate psychological maladjustment in small 
groups. 

The work of Paterson (1955), Harvey 
(1956), Haefner, Langham, Axelrod, and 
Lanzetta (1954), and Sherif and Sherif 
(1953) indicates that members of a group 
become more cooperative and cohesive when 
their group is confronted by a common threat. 
Such a common threat, posed by a common 
enemy or opponent, should lead to closer in- 
terpersonal relations and hence to increased 
adjustment. Myers (1962a, 1962b) in two 
earlier studies under this program, has uti- 
lized this principle in testing the hypothesis 
that groups in competition with one another 
constitute a more quasi-therapeutic environ- 
ment than comparable groups which do not 
compete with one another. His study of rifle 
teams showed that competitive group condi- 
tions had adjustive effects even for men in 
losing teams. A second study (1962a) investi- 
gated the effect of competitive and noncom- 
petitive team golf on hospitalized schizo- 
phrenics. This investigation again showed that 
patients in relatively good reality contact im- 
proved in their adjustment and social inter- 
actions as compared with a comparable con- 
trol group in which members of the twosomes 
were rewarded irrespective of their perform- 
ance. The investigation reported here is the 
third in the series specifically concerned with 
quasi-therapeutic effects of intergroup compe- 
tition. 

Previous studies were conducted with ad 
hoc groups under controlled conditions. The 
present investigation extends the study of 
quasi-therapeutic competitive conditions to a 
field setting in which the group is of consid- 
erable salience to the member. Specifically, 
this study tested the hypothesis that small 
military groups, which compete in their nor- 
mal training and garrison activities, will im- 
prove to a significantly greater extent in their 
interpersonal relations and in the psychologi- 
cal adjustment of their members than will 
controls receiving routine training. 

However, as Deutsch (1962) has recently 
noted, the prototypic condition of completely 
cooperative relations among group members 
does not exist in the natural setting. “The 
members of a basketball team may be coop- 
eratively interrelated with respect to winning 
the game but competitive with respect to be- 
ing the ‘star’ of the team [pp. 277-278]." 
Men in military groups must also cooperate 
in the attainment of some common goals but 
may compete in order to achieve some per- 
sonal goals such as special consideration from 
superiors, recommendation for promotion, or 
simply avoidance of undesirable details. The 
present investigation attempted to make the 
group situation more adjustive by shifting the 
relative emphasis to competitive relations be- 
tween groups rather than among the indi- 
vidual members of the same group. 


METHOD 
Subjects 


Men from 27 squads of a combat engineer bat- 
talion participated in the experiment as part of their 
normal training. This battalion was a combat-ready 
unit. The 178 men for whom complete data were 
available had served in the Army at least 6 months 
and had undergone a minimum of 16 weeks of indi- 
vidual basic training as well as additional training 
in their combat engineer specialties. The men ranged 
in age from 18 to 46, with a median age of 22 years; 
their educational level ranged from 8 to 16 years of 
school with a median of 12 years. 


Design 


The combat engineer battalion consisted of three 
field companies as well as headquarters and sup- 
porting troops. One of the field companies was 
randomly chosen to receive the competitive training 
condition (EC). The two control companies (CC) 
proceeded with training and duties on the post as 
before, without special instructions as far as the 
experiment was concerned. 

Training, routine tasks and recreational activities 
in the experimental company were designed to em- 
phasize competition among the 9 squads in the 
company while the 18 squads in the two control 
companies were not subjected to any experimental 
changes in training. Criteria of personal adjustment 
were obtained from questionnaires which were ad- 
ministered before and after a 3-month period during 
which the competitive treatment was applied. Squad 
averages were computed from the individual adjust- 
ment measures. The general design of the investiga- 
tion was thus a standard before and after compari- 
son of groups which had been assigned to the ex- 
perimental and control conditions. 
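
The before-and-after comparison at the squad level can be illustrated with a short sketch. It assumes, as the report states elsewhere, that only men present at both sessions enter the analysis; the function name, data layout, and scores are hypothetical.

```python
from statistics import mean


def squad_mean_change(pre_scores, post_scores):
    """Return the post-minus-pre change in a squad's average score.

    pre_scores and post_scores map a man's identifier to his score.
    Only men with scores at both sessions are used.
    """
    present_both = sorted(set(pre_scores) & set(post_scores))
    if not present_both:
        return None  # no usable data for this squad
    pre_average = mean(pre_scores[man] for man in present_both)
    post_average = mean(post_scores[man] for man in present_both)
    return post_average - pre_average


# Hypothetical self-esteem scores; man "c" missed the second session.
pre = {"a": 50, "b": 54, "c": 60}
post = {"a": 53, "b": 55}
change = squad_mean_change(pre, post)  # (53 + 55)/2 - (50 + 54)/2 = 2
```

Change scores of this kind, one per squad, would then be compared between the experimental and control companies.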

Only the battalion commander, the battalion execu- 
tive officer, and the commander of the experimental 
company were informed about the study. These offi- 
cers worked closely with the project staff in devising 
and administering the experimental treatment. None 
of the other officers or enlisted men in the battalion 
knew about the experiment or were aware of any 
experimental changes in training, as indicated by 
interviews held after the conclusion of the study. 
These interviews also showed that the men assigned 
no particular significance to the questionnaires which 
were administered as “part of a survey for a 
medical research project.” 


Competition among Squads 


Life in the military is characterized by continual 
surveillance and evaluation. A day may begin with 
a personal clothing inspection at reveille. Before 
breakfast has settled, a man finds he must clean and 
arrange his barracks area and then “police” the sur- 
rounding landscape to meet the requirements of still 
another inspection. Training activities also include 
this regular evaluation of performance. Such a rou- 
tine offers innumerable opportunities for comparisons 
among individuals and among military units, to- 
gether with the possibility of differential recognition 
and reward. 

In essence, the present experimental treatment 
consisted in making these training and garrison ac- 
tivities into contests among the squads. Demerits, 
commendations, 3-day passes, and other organiza- 
tional rewards were earned by squads as a whole, in 
contrast with the control procedure of rewarding or 
punishing the individual. The investigators inter- 
mittently reviewed the procedures that were followed 
in both the experimental and control companies. 
These observations indicated that the two groups 
received approximately the same amount of re- 
wards throughout the experimental period. Empha- 
sis on intersquad competition was introduced into 
the experimental company (EC) very gradually, and 
eventually encompassed the various inspections of 
barracks areas, personal equipment, squad weapons, 
personal weapons, vehicles, etc. Training exercises 
were also conducted competitively by comparing 
squad performances, for example, in obstacle course 
maneuvers, bivouac, and in recreational games. The 
commander of the experimental company provided 
rewards for squad contests and judged the quality of 
squad performance. It should be stressed that the 
study involved no activities which were especially 
devised for the sake of the experiment. The military 
situation provided more than ample opportunity for 
intersquad competition, and none of the experimental 
procedures departed from accepted training pro- 
cedures and military doctrine. The study merely 
called for a change in emphasis, designed to enhance 
the integrity and importance of squad membership 
and thus to heighten the sense of promotive relations 
among squad members. 

Data analysis was limited to men who were 
present for both questionnaire testing sessions. The 
number of men present per squad varied from 2 to 9. 
The average number of men per squad included in 
the analysis was 4.4 for both the control and the 
experimental companies. Subject attrition from pre- 
to posttest sessions was primarily due to regular 
military procedures (e.g., transfers, emergency fur- 
loughs, temporary duty assignments, etc.). There was 
some loss of data due to incomplete questionnaires. 
Both experimental and control groups, however, 
showed approximately equal attrition of subjects 
due to these sources. There was no evidence of 
differential selectivity or loss of subjects. Care was 
also taken to avoid involving company and bat- 
talion personnel in the administration of question- 
naires. Attendance at testing sessions was com- 
pulsory for all personnel and men were excused 
from details to permit them to attend the testing 
sessions. 


Adjustment and Interpersonal Relations 
Criteria 


Self-report adjustment measures. The question- 
naire battery contained three major indices of squad 
member adjustment: the Taylor Manifest Anxiety 
(MA) scale, a 12-item evaluative self-description 
using a semantic differential format, and a Personal 
Reaction Form (PRF) composed of a set of graphic 
rating items which ask about reactions to military 
life and squad training. 

The MA scale has been described elsewhere in 
detail (see Taylor, 1953). Choice of this measure 
as an index of adjustment is justified by Hoyt and 
Magoon (1954) and Buss (1955) who reported 
that MA scores are clearly related to such signs of 
tension and anxiety as hesitant speech, perspiration, 
and nervousness. Fiedler et al. (1959) also used MA 
scores successfully as one criterion of adjustment 
of men in small college and military groups. 

The semantic differential self-descriptions were 
composed of 12 8-point scales bounded by bipolar 
adjectives. The particular bipolar adjectives used 
here were chosen to reflect the evaluative dimen- 
sion of connotative meaning, for example, “good- 
bad” and “friendly-unfriendly” (see Osgood, Suci, & 
Tannenbaum, 1957). 

To determine whether the various “self-esteem” 
items could be combined into a single score these 
12 scales were intercorrelated for the entire pre- 
session sample and factor analyzed separately for 
the descriptions of self, least-preferred co-worker, 
squad leader, and fellow squad member. Nine scales 
loaded sufficiently on the first factor in the four 
analyses to indicate consistency in the favorability 
of subject self-ratings. These were: pleasant-un- 
pleasant, friendly-unfriendly, good-bad, lazy-hard- 
working, distant-close, cold-warm, self-assured-hesi- 
tant, efficient-inefficient, and fair-unfair. In addition, 
three scales appeared to define a second factor. 
These scales were: patient-impatient, sad-happy, 
and tense-relaxed; they again clustered in all four 
analyses. This factor was labeled “emotional ad- 
justment” and was scored separately. Both the “self- 
esteem” and “emotional adjustment” ratings were 
obtained by summing the responses across the ap- 
propriate scales. The most favorable rating was 
scored 8, and least favorable, 1. Self-esteem scores 
thus could range from 72 to 9, and the emotional 
adjustment rating from 24 to 3. 
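
The scoring rule can be sketched as follows. The scale lists and the 8-to-1 favorability scoring come from the text; the coding of a raw mark as a position from 1 (leftmost) to 8 (rightmost), the orientation flags, and all names are assumptions made for illustration.

```python
# Scale name -> True when the favorable adjective is printed on the left.
SELF_ESTEEM_SCALES = {
    "pleasant-unpleasant": True,
    "friendly-unfriendly": True,
    "good-bad": True,
    "lazy-hardworking": False,
    "distant-close": False,
    "cold-warm": False,
    "self-assured-hesitant": True,
    "efficient-inefficient": True,
    "fair-unfair": True,
}
EMOTIONAL_ADJUSTMENT_SCALES = {
    "patient-impatient": True,
    "sad-happy": False,
    "tense-relaxed": False,
}


def scale_score(raw_position, favorable_on_left):
    """Convert a raw position (1..8, assumed left to right) to favorability."""
    if not 1 <= raw_position <= 8:
        raise ValueError("raw position must lie between 1 and 8")
    # The most favorable rating is scored 8, the least favorable 1.
    return 9 - raw_position if favorable_on_left else raw_position


def cluster_score(raw_positions, scales):
    """Sum favorability scores across the scales of one cluster."""
    return sum(
        scale_score(raw_positions[name], favorable_on_left)
        for name, favorable_on_left in scales.items()
    )


# A man who marks the most favorable box on every scale.
best_marks = {
    name: (1 if left else 8)
    for name, left in {**SELF_ESTEEM_SCALES, **EMOTIONAL_ADJUSTMENT_SCALES}.items()
}
best_self_esteem = cluster_score(best_marks, SELF_ESTEEM_SCALES)      # 72
best_emotional = cluster_score(best_marks, EMOTIONAL_ADJUSTMENT_SCALES)  # 24
```

With nine scales the self-esteem total runs from 9 to 72, and with three scales the emotional adjustment total runs from 3 to 24, matching the ranges stated above.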

Eleven graphic rating items were included in 
both the pre- and postsession Personal Reaction 
Forms. These PRF items were similarly inter- 
correlated and factor analyzed for the presession 
sample to identify those items which clustered. 
The items had been designed to obtain squad member 
reactions to, and perceptions of, various phases of 
Army life. Four clusters were identified and given 
the following labels: general adjustment to Army 
life, perceived integration of the squad, identifica- 
tion with own squad, and perceived harmony of the 
squad. Cluster scores were obtained for each man 
simply by summing his component item ratings. 

Objective indices of adjustment. Additional in- 
formation relevant to military adjustment was ob- 
tained at the termination of the experimental train- 
ing period from disciplinary and medical records of 
the battalion. 

The incidence of recorded disciplinary problems 
for the men of these units was quite low, and 
scores are, therefore, highly unreliable. It was pos- 
sible, however, to compile a rough score for each 
squad. These scores indicated the history of any 
disciplinary problems for the men in each squad 
during the 3-month experimental period. Each score 
was the cumulative incidence of any of the fol- 
lowing: courts martial, disciplinary citations sub- 
mitted to the company commanders by the military 
police, and disciplinary actions initiated by the 
respective company commanders. 

To assess the probable psychogenic component of 
medical problems, a record was maintained of those 
men who reported for sick call during the 3 
months of the investigation. As part of this record, 
the medical officer in charge of the dispensary rated 
each man who reported for sick call on the degree 
to which his complaint seemed to have a psycho- 
genic origin. Those records were used to calculate 
a cumulative score for each squad in the battalion. 
The score was a function of the relative frequency 
of sick call visits multiplied by the psychogenic 
rating given each complaint by the attending medical 
officer. Although these scores did not permit the 
more sensitive before-after comparisons, postexperi- 
mental averages could be compared for the experi- 
mental and control squads. 

Interpersonal relations among squad members. The 
questionnaire battery also contained three types of 
measures indicative of the quality of the inter- 
personal relations among squad members. First, 
semantic differential descriptions of fellow squad 
members generated "interpersonal esteem" scores. 
These scores, calculated in the same manner as the 
self-esteem score described above, indicated the 
favorableness with which squad members perceive or 
judge one another. Such scores have previously been 
found related to expressions of positive feeling by 
group members (Julian & McGrath, 1963) and suc- 
cessful task performance (McGrath & Julian, 1962; 
Myers, 1962b). 

Four sociometric nomination items comprised the 
second index of squad interpersonal relations. These 
four items asked for nominations of those: with 
whom you could talk over personal problems, with 
whom you would like to go on pass, who would 
make the best leaders for a combat mission, and 
with whom you work best. Three nominations were 
obtained for each question. A score was obtained 
by calculating the number of fellow squad members 
who were chosen by the men in a squad, as a pro- 
portion of the total choices they made. Thus, the 
higher the score, the greater the proportion of 
within-squad choices of confidants, friends, leaders, 
and work partners. 
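
The within-squad choice score can be illustrated by a brief sketch. It assumes that the nominations of all men in a squad are pooled across the items being scored; the function name and the sample names are hypothetical.

```python
def within_squad_choice_score(squad_members, nominations):
    """Proportion of a squad's total choices that fall on fellow squad members.

    nominations maps each chooser to the list of men he named.
    """
    members = set(squad_members)
    total_choices = 0
    within_choices = 0
    for chooser, chosen in nominations.items():
        for name in chosen:
            total_choices += 1
            if name in members and name != chooser:
                within_choices += 1
    if total_choices == 0:
        return None  # no nominations were made
    return within_choices / total_choices


# Hypothetical squad of three men, each giving three nominations on one item.
squad = ["Adams", "Baker", "Clark"]
choices = {
    "Adams": ["Baker", "Clark", "Davis"],
    "Baker": ["Adams", "Evans", "Ford"],
    "Clark": ["Adams", "Baker", "Davis"],
}
score = within_squad_choice_score(squad, choices)  # 5 of 9 choices stay in the squad
```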

TABLE 1 

AVERAGE CHANGE IN ADJUSTMENT FOR MEMBERS OF 
COMPETITIVE AND CONTROL SQUADS 

                                        Squads 
Variable                         Competitive    Control 

Self-esteem                      +1.9           -1.5* 
Emotional adjustment             +1.3           — e 
Anxiety (MA scale)               —14            + 67% 
Adjustment to Army life          +.94           — 8g 
Perceived squad integration      +.92           -.38* 
Identification with squad        +1.77          .60 
Perceived squad harmony          -n             —1.19 
Perceived competition            +1.22          + 06H 
  among squads 

Note.—Statistical evaluation of differences between squads followed the 
procedure for orthogonal comparisons as discussed in Edwards (1960, p. 140). 
* p < .10, df = 1/24. 
** p < .05, df = 1/24. 
*** p < .025, df = 1/24. 

A third assessment indicative of the interpersonal 
relations among squad members was a measure of 
interpersonal knowledge or familiarity suggested by 
the battalion commander. We adopted an inventory 
of interpersonal knowledge developed by Havron 
and McGrath (1961) which requires each man to 
provide biographic information about his fellow 
squad members, for example, his home state, number 
of years of schooling received, and his middle name. 
Each man's familiarity score was computed as a ratio 
of the amount of correct information he gave to the 
total of possible correct answers. A high score 
showed that a man had proportionately close famil- 
jarity with his fellow squad members. This score 
is presumably related to the degree of interpersonal 
contact and communication among squad members, 


RESULTS 
Improvement in Adjustment 


According to the major hypothesis of this 
study, intersquad competition should lead to 
improved interpersonal relations and squad 
member adjustment. Table 1 presents the sta- 
tistical comparison of adjustment for the 
competitive and control squads. 

Changes in self-esteem and emotional ad- 
justment ratings indicated clearly the relative 
improvement in the self-perceptions of the 
squad members in competitive groups as compared
with members of other squads. Squads
trained under competitive conditions also had 
a lowered level of manifest anxiety on the 
Taylor scale. This improvement in personal 
adjustment was also observed in the men’s 
reactions to military life. Improvement in 
general adjustment to the Army was indicated 
in responses to the items: “How strongly do
you want to make a career of the Army?”
and “How satisfied and contented have you
been with military life?” (Cluster 1, PRF). 
Perceived Squad Integration (Cluster 2) in- 
cluded responses to the items: “How well do 
you feel you have gotten to know the other 
men in your squad?” “How similar do you 
feel that the men in your squad are to one 
another?” and “How well do the members of 
your squad work together?” A similar pattern 
of improvement was shown by the clusters
designated “Identification with the Squad”
and “Perceived Squad Harmony.” In total,
these changes point to a marked improvement
in squad member morale, satisfaction, and
personal adjustment under the competitive
training conditions.

Table 1 also presents a comparison of the 
“perceived competition” among squads under 
the experimental and control training conditions.
This item was included in the PRF
and provided a check on the impact of the 
experimental competitive squad training. 


Improvement of Squad Interpersonal Relations


Myers’ study had shown that the competi- 
tive condition led to better interpersonal rela- 
tions. This result was also expected in the 
present investigation. As shown in Table 2, 
changes in the hypothesized direction oc- 
curred in six of the seven measures, with two 
of these statistically significant. Interestingly 
enough, “esteem received” scores which re- 
flected the favorableness with which squad 
members judged one another did not change 
significantly in competitive squads. Indeed, 
the scores showed a slightly greater improve- 
ment under control conditions than experi- 
mental conditions. Friendship choices for the 
sociometric nomination items paralleled the 
findings obtained for interpersonal esteem. 
These changes were shown in nominations of:
“someone with whom you would talk over a
personal problem” and “someone with whom
you would like to go on a pass.” Changes in
intrasquad familiarity similarly paralleled the 
shifts in interpersonal esteem. Experimental, 
competitive squad members did not increase 
in their personal knowledge of one another. 

The remaining two sociometric questions 
which are related to work and task relations 
among squad members did indicate a signifi- 
cantly greater change in intrasquad relations 
for competitive squads. Under the competitive
training condition choices of “combat leaders”
and “work partners” were more frequently
made from among fellow squad members. 
Hence, at least for these small military 
groups, group competition led to an improve- 
ment in both the “work relations” and per- 
sonal adjustment of the men. These effects, 
however, did not generalize to other relation- 
ships among squad members. 


TABLE 2

AVERAGE CHANGE IN INTERPERSONAL RELATIONS
OF SQUAD MEMBERS

                                            Squads
Variable                            Competitive    Control
Esteem received from fellow            +2.6         +4.5
  squad members
Perceived emotional adjustment         + .91        + .80
Sociometric nominations
  Person with whom you would           + .25        + .47
    talk over personal problems
  Person with whom you would           + .16        + .09
    like to go on pass
  Person who would make                + .25        − .03**
    best combat leader
  Person with whom you work            + .24        + .05*
    best
Intrasquad familiarity                 + .32        + .40

Note.—Statistical evaluation of differences between the
competitive and control squads was calculated using the
procedure for orthogonal comparisons among treatment means
discussed in Edwards (1960, p. 140).
*p < .05, df = 1/24.
**p < .01, df = 1/24.


Objective Indices of Adjustment


Only terminal, postsession scores were
available to measure medical and disciplinary
problems in the squads. The average postsession
levels for the experimental and control
squads were compared using the Mann-Whitney
U test on the ranked scores. No differences
were found between conditions either
for frequency of sick call visits adjusted 
for apparent psychogenic origin (p> .05, 
U < 71), or history of disciplinary problems 
during the 3-month experimental period 
(p> .05, U < 61). 


Discussion 


A possible limitation of the study lies in 
the fact that only one experimental and two 
control companies were used. In such a field 
study as this it is desirable to have several 
experimental units and several control units 
in order to take account of the variance due 
to sampling intact groups rather than individuals.
For a number of reasons, this was not
possible in the present study. Our experi- 
mental and control companies might have dif- 
fered on certain critical variables, and these 
differences could account for our results. 
However, our data reveal no large pretest 
differences between experimental and control 
groups. Thus, it seems reasonable to conclude 
that the experimental results are valid. 

The major significance of this study lies in 
the fact that we have been able to validate, 
under field conditions, the hypothesis that 
intergroup competition has quasi-therapeutic, 
adjustive effects on team members. The study
supports the assumption that intergroup competition
leads to improved work relations in
the group, and to higher self-esteem, lower
anxiety, and greater satisfaction with the con- 
ditions of group life. The quality of the inter- 
personal relations improves in task-related 
aspects but not necessarily in such aspects 
as wanting to go on pass with fellow squad 
members, or wishing to talk over personal 
problems with them. The improvement ap- 
pears to be primarily in the trust and con- 
fidence the individual feels for his fellow 
group members, as shown by the significant 
increase of sociometric nominations of persons 
who would make a good combat leader and 
those with whom the individual can work 
best. Our data obviously cannot tell us 
whether the interpersonal relationship im- 
proved because the squad members became 
better adjusted, or whether the squad mem- 
bers adjusted because the interpersonal rela- 
tionships improved. It seems reasonable to 
assume, however, that the latter is the case, 
or else, that improvement in adjustment and 
interpersonal relations occurred hand in hand. 
On the basis of our previous work (Fiedler 
et al., 1959), we are inclined to feel at this 
point that the improvement of the interper- 
sonal relationship was causal to the improve- 
ment in adjustment. This interpretation is 
also clearly suggested by the findings of Alex- 
ander and Drucker (1960) who found that 
the experimental modification of group mem- 
bers' interpersonal perceptions, and hence, 
interpersonal relations, affected self-ratings of 
adjustment on the part of the individual who 
was the object of the perceptions.

While the results of this study, especially 
those relating to self-ratings and anxiety 
scores, are quite clear in support of our original
hypothesis, the importance of this experiment
is probably in the demonstration that
task groups under field conditions can be 
engineered by appropriate environmental ma- 
nipulation to contribute to the individual 
group member’s adjustment. It further shows 
that such effects can be accomplished through 
administrative channels within the context of 
routine operational conditions rather than 
through the intervention of mental health 
specialists. This quasi-therapeutic effect is all 
the more striking inasmuch as combat engineer
squads are not especially noted for the
therapeutic environment which they provide 
for their members. In fact, it was most un- 
fortunate for our study that the control com- 
panies began to imitate the experimental com- 
pany's emphasis on intersquad competition 
even before the experiment was concluded. 
This, no doubt, materially lessened the ef- 
fects of the manipulation which the study 
introduced. 

In light of the critical shortage of profes- 
sional personnel in the mental health field, 
the possibility of promoting better adjust- 
ment through relatively minor changes in the 
administrative structure opens an exciting 
vista of future possibilities. 

The question must be asked whether the 
experimental manipulation had a favorable or 
adverse effect upon the groups’ effectiveness. 
The data, based on military performance tests 
administered at the termination of the experi- 
ment, did not show significant differences 
between control and experimental companies. 
It is nevertheless noteworthy that the experi- 
mental company was later considered by post 
headquarters to have been the best project 
company on the post. It is also of interest 
that the entire battalion changed to the sys- 
tem of training introduced by the experi- 
mental company. There is thus ample evi- 
dence in this study that the intersquad 
competition not only aided in the adjustment 
of the individual, but that it also did not 
interfere with effective performance, if it did 
not, indeed, contribute to it. 
PROPENSITY FOR RISK TAKING AS A DETERMINANT
OF VOCATIONAL CHOICE:
AN EXTENSION OF THE THEORY OF ACHIEVEMENT
MOTIVATION ¹


JOHN L. MORRIS ²


94 male high-school seniors were asked to choose an occupation from each of
10 lists. Each list was representative of a Kuder vocational interest category 
and contained jobs ranging from very easy to very difficult. Choices were 
examined in those categories in which Ss perceived their probability of success 
as highest and lowest. An index of resultant achievement motivation was 
used to bisect the group, and the vocational choices of each section were shown 
to support the underlying model of risk taking. Individuals high in achieve- 
ment-related motivation chose as if they were attempting an intermediate 
degree of risk. Those low in achievement-related motivation chose as if they 
were avoiding an intermediate degree of risk. 


The purpose of this study is to extend a 
theoretical model of motivation and risk tak- 
ing (Atkinson, 1957, 1958; McClelland, At- 
kinson, Clark, & Lowell, 1953). The model 
stipulates that an individual's preference for 
an intermediate degree of risk (probability of
success, P_s = .5) varies directly as the
strength of his achievement motivation and 
inversely as the strength of his avoidance 
motivation. 
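
The text does not spell out the algebra of the model. A minimal sketch, assuming the usual statement of Atkinson's (1957) formulation in which the resultant tendency equals (M_s − M_af) × P_s × (1 − P_s), shows why an intermediate P_s is the most attractive level when the approach motive dominates and the least attractive when the avoidance motive dominates. The function and variable names, and the motive strengths used, are hypothetical.

```python
def resultant_tendency(m_success, m_avoid, p_success):
    """Resultant tendency under the assumed formulation
    (M_s - M_af) * P_s * (1 - P_s)."""
    return (m_success - m_avoid) * p_success * (1.0 - p_success)


# Candidate probabilities of success from .1 to .9.
probabilities = [i / 10 for i in range(1, 10)]


def most_preferred(m_success, m_avoid):
    """Probability level with the largest resultant tendency."""
    return max(probabilities,
               key=lambda p: resultant_tendency(m_success, m_avoid, p))


def least_preferred(m_success, m_avoid):
    """Probability level with the smallest (most negative) resultant tendency."""
    return min(probabilities,
               key=lambda p: resultant_tendency(m_success, m_avoid, p))


# Illustrative motive strengths only; they are not data from the study.
approach_dominant_peak = most_preferred(2.0, 1.0)
avoidance_dominant_trough = least_preferred(1.0, 2.0)
```

Under this assumption both functions single out P_s = .5: it is sought when achievement motivation is the stronger motive and avoided when fear of failure is the stronger one.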

Previous studies have tended to look at vo- 
cational choice either from the viewpoint of 
interest inventories or from the viewpoint of 
level of aspiration. This study is exploratory 
inasmuch as vocational choice is seen as the 
result of the interaction of the two. The in- 
tention is to show that the risk-taking pro- 
pensity of an individual is related to his 
estimate of his P_s in a chosen occupation.
This probability estimate is considered to
have two components: one pertaining to the
level of difficulty of the occupation and the
other pertaining to the field of choice. In this
study an index of resultant achievement motivation
was used. Atkinson and Litwin
(1960) suggest that when measures of achievement
motivation (approach tendency) and
fear of failure (avoidance tendency) are used
conjointly then the resultant index is more
closely related to certain outcome measures
than either of the variables used independently.


¹ This article has been adapted from a dissertation
(Morris, 1964) submitted in partial fulfillment of
the requirements for the degree of Doctor of Education
at the University of California, Berkeley. The
author expresses appreciation to Robert W. Moulton
for his guidance during the research and to the committee
in charge of graduate research funds for their
generous support.

² Presently with the Department of Education,
University of California, Berkeley.

There is a series of studies which have demonstrated
that the two motives have relevance
for vocational choice. McClelland (1956) has
shown that the person who is highly motivated
to achieve is disposed to take moderate or
calculated risks in preference to very speculative
or very safe undertakings. Minor and
Neel (1958) have demonstrated a significant
positive relationship between an individual's
achievement motivation and the prestige rank
of his occupational preference. Mahone (1960)
showed that persons who are high in achievement
motivation and low in fear of failure
tend to be realistic in their vocational choice
with respect to both ability and interest. Persons
low in achievement motivation and high
in fear of failure tend to be unrealistic. Atkinson
and O'Connor ³ supported Mahone’s
findings and also demonstrated that intelligence,
as conventionally measured by a single
score omnibus test, is an important factor associated
with the realism of vocational choice:
individuals low in intelligence tending to make
unrealistic choices irrespective of their motivation.
Burnstein (1963) showed that, as fear
of failure increases, the prestige of the aspired-to
occupation decreases and the willingness
to settle for less satisfying and less
prestigeful occupations increases. Burnstein,
Moulton, and Liberty (1963) found that
when a discrepancy obtains between prestige
and required competence, activities demanding
a high degree of excellence relative to
prestige conferred are likely to be more attractive
to individuals high in achievement
motivation and achievement values.

³ “The Effects of Ability Grouping in Schools,”
report to the Department of Health, Education, and
Welfare, 1963.

The four pictures used in this study were 
those recommended by Atkinson. The source 
of each picture and detailed instructions for 
administration are given in Appendix III (At- 
kinson, 1958). The pictures were shown in a 
group setting under neutral conditions; that 
is, a condition in which no experimental at- 
tempt is made to either arouse the motive or 
create an especially relaxed state prior to the 
writing of the stories (Atkinson & Reitman, 
1956). A measure of test anxiety, first developed
by Mandler and Sarason (1952), provides
an assessment of the tendency to be
anxious about failure in achievement situa- 
tions. This anxiety tends to interfere with 
efficient performance of complex intellectual 
tasks when there is external pressure to 
achieve. The self-report questionnaire for use 
with high school students, developed by Judith 
Cowen (Mandler & Cowen, 1958), was used 
in this study. 


Hypotheses


Let us assume that an individual estimates
his P_s in general as very high in mechanical
jobs, as moderate in persuasive, and
very low in artistic. Assume further that he
is presented with a list of jobs in each of
these fields and that each list is composed
of jobs ranging from very easy (high P_s) to
very difficult (low P_s). It has been demonstrated
that senior-high-school and college
students are able to rank occupations reliably
in order of difficulty and that the order corresponds
very closely to the order of occupations
on the National Opinion Research Center
Prestige Scale (Atkinson & O'Connor ³;
Mahone, 1960). As would be expected, students
perceive that high-level or difficult jobs
have a lower probability of success than easy
or low-level jobs.


FIG. 1. A concept of the effect of an individual's
expectancy of success in various occupational fields
upon his expectancy of success in specific occupations
typically equated in level of difficulty in conventional
occupational scales.

It is envisaged that where an individual 
sees himself as competent or as having a high 
P_s in the field (high P,j), he would bias the
P_s of all the occupations in the field upward
so that even the most difficult job is not far
removed from the intermediate degree of
risk (P_s = .5). Where he sees himself as incompetent
and as having a low P_s in the field,
he would bias the P_s of all occupations downward
so that the easiest is not remote from
the intermediate degree of risk. This situation 
is illustrated in Figure 1. 

The risk-taking model stipulates that in- 
dividuals high in resultant motivation will 
prefer occupations having an intermediate de- 
gree of risk. It is deduced from Figure 1 that 
this requirement would be met by the selec- 
tion of relatively difficult jobs in a field of 
high P, and relatively easy jobs in a field of 
low P,. These possibilities will be termed the 
typical bias. The model also stipulates that 
individuals low in achievement-related mo- 
tivation will prefer a degree of risk which is 
divergent from the intermediate level. Re- 
ferring again to Figure 1, it is seen that such 
an individual should prefer relatively easy 
jobs in a field of high P_s and relatively difficult
jobs in a field of low P_s. These possibilities
are called the atypical bias. This
brings us to the research hypotheses. 
1. The probability that individuals who are
high in resultant motivation make typical 
choices is greater than the probability that in- 
dividuals who are low in resultant motivation 
make typical occupational choices. 

If this prediction is supported, it should 
then be possible to compare the absolute 
rather than the relative levels of difficulty of 
choices made by persons differing in achieve- 
ment-related motivation. 

2. In making choices within occupational 
fields perceived as having a high P_s, the probability
that individuals who are high in re-
sultant motivation choose difficult jobs is 
greater than the probability that individuals 
low in resultant motivation choose difficult 
jobs. 

3. In making choices within occupational 
fields perceived as having a low P_s, the probability
that individuals who are high in re-
sultant motivation choose easy jobs is greater 
than the probability that individuals low in 
resultant motivation choose easy jobs. 

It is proposed that this first set of hy- 
potheses should be tested by asking the re- 
spondents to choose jobs from prepared lists 
representing a variety of fields of interest or 
job activity. If the hypotheses are adequately 
demonstrated, then it should be possible to 
make predictions about expressions of voca- 
tional choices which individuals make on 
leaving school. That is to say, an individual’s 
freely expressed job choice obtained from a 
questionnaire (called “career choice”) should 
represent a convergence of the risk-taking 
potential and his perceived P_s. It is anticipated
that few individuals will make a choice in a
field of low P_s and so a parallel hypothesis for
this contingency has not been given. 

4. If a career choice is made in a field in 
which an individual estimates his P_s as high,
the level of difficulty of the choice is positively
related to his resultant motivation.


METHOD 


The experimental sample consisted of 108 senior-high-school
boys drawn from a suburban high school
in the San Francisco Bay Area. The sample was not 
chosen at random because of administrative diffi- 
culties within the school. The boys were enrolled in 
physical education classes which drew from all sec- 
tions of the curriculum, and a comparison of measures 
of their ability, occupational interests, and socioeconomic
status showed them to be typical of the
total enrollment of seniors. Twelve boys were absent 
on the first day of testing and the test protocols of 
two more were incomplete. 

The group was divided on the basis of the strength
of their approach and avoidance motivation. Atkinson
and O'Connor's ³ technique was used whereby the
scores of each individual on the measures of n
Achievement and Test Anxiety were first standardized
and the difference between these two standardized
scores was determined. The resulting distribution
of “resultant motivation” was then split at the median
score.
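
The scoring steps just described can be sketched as follows. This is a hedged illustration, not the original scoring procedure: the function names are hypothetical, the population standard deviation is an assumed choice, and placing scores that fall exactly at the median in the low group is also an assumption, since the text does not say how ties were handled.

```python
from statistics import mean, median, pstdev


def standardize(scores):
    """Convert raw scores to z scores (population SD is an assumption)."""
    mu = mean(scores)
    sd = pstdev(scores)
    return [(x - mu) / sd for x in scores]


def split_by_resultant_motivation(n_achievement, test_anxiety):
    """Return (resultant scores, 'high'/'low' labels) for each subject.

    Resultant motivation = z(n Achievement) - z(Test Anxiety),
    split at the median of the resulting distribution.
    """
    z_ach = standardize(n_achievement)
    z_anx = standardize(test_anxiety)
    resultant = [a - t for a, t in zip(z_ach, z_anx)]
    cut = median(resultant)
    # Assumption: scores exactly at the median are assigned to the low group.
    labels = ["high" if r > cut else "low" for r in resultant]
    return resultant, labels


# Made-up raw scores for four subjects, for illustration only.
demo_resultant, demo_labels = split_by_resultant_motivation(
    [10, 2, 8, 4],   # n Achievement
    [1, 9, 3, 7],    # Test Anxiety
)
```

Subjects with high n Achievement and low Test Anxiety fall in the high group, which is the sense in which the index combines the approach and avoidance tendencies.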

A specially designed “Level of Difficulty Scale”
was then administered to elicit the level of difficulty 
of choice within various occupational interest areas. 
Each subject was required to state a preference of 
occupation from each of 10 lists of occupations each 
comprising nine jobs. Each list contained jobs homo- 
geneous with respect to the dominant interest (Kuder, 
1956) required of an individual undertaking that kind 
of work. The ordering of difficulty of items in each 
of the 10 lists was randomized from list to list. 
Scores were assigned to each of the occupations; the 
most difficult being assigned a score of 9 and the 
least difficult a score of 1. The lists were compiled 
using the Haller Occupational Aspiration Scale 
(Haller & Miller, 1963) and the National Opinion Re- 
search Center (1953) Survey’s Occupational Prestige 
Scale as referents. The test-retest reliability of the 
instrument was established using a sample of 25 
boys drawn at random from the same group as the 
main experimental sample. They were asked to rank 
the jobs in each list in order of difficulty on two 
occasions 2 weeks apart. The median rank assigned
on the first occasion was used as the index of difficulty
with which a particular occupation is perceived
by the senior-high-school boys used in the
main experimental sample. 

Estimates of probabilities of success in various occupational
fields were obtained from a second specially
designed “Probability of Success Scale.” The
subject was simply asked to make an estimate of his 
chance of success in each of 10 fields (Kuder, 1956) 
of work. He responded to definitions of these fields 
by checking a line representing various degrees of 
probability of success. The test-retest reliability of 
this scale was established in the manner reported 
above. The level of difficulty of jobs chosen within 
fields perceived as having the highest and lowest P_s
was then examined for two groups of subjects: 
those high and those low in resultant motivation. 

Hypothesis 4 was based on the premise that freely 
expressed occupational preferences in response to a
questionnaire item, “What occupation do you desire
to enter when you have finished your schooling?”
would follow the same pattern as preferences for 
jobs included in the prepared lists. The preference of 
each subject was classified according to its relevant 
occupational field using the Kuder manual as a 
referent; and the difficulty level of the choice was 
obtained using the Level of Difficulty scales as a 
referent. The subject's perceived P_s in the field was
ascertained from the Probability of Success scales.
It was originally intended that all analyses would
be replicated for subjects high and low in socioeconomic
status using the Hollingshead and Redlich
(1958) index. However, the obtained range of SES
scores was considered too small and the variable was
deleted. For reasons already given, IQ was introduced
as an experimental variable. The Henmon-Nelson
omnibus, single score measure of general ability was
used.


TABLE 1

FREQUENCY OF OCCURRENCE OF THE TYPICAL AND
ATYPICAL BIAS FOR THE GROUP AS A WHOLE
AND FOR SUBJECTS ABOVE AND BELOW
THE MEDIAN IQ

                     Resultant motivation
Group and bias         High      Low      Total
Total
  Typical               33        22       55
  Atypical               6        22       28
  Totalᵃ                39        44       83ᵇ
High IQ
  Typical               21        10       31
  Atypical               1         9       10
  Totalᶜ                22        19       41**
Low IQ
  Typical               12        12       24
  Atypical               5        13       18
  Totalᵈ                17        25       42*

ᵃ χ² = 11.05, df = 1.
ᵇ Eight high resultant-motivated and 3 low resultant-motivated
individuals did not bias at all (zero bias) and results
for 11 subjects do not appear in this analysis.
ᶜ χ² = 7.90, df = 1, Yates’ correction.
ᵈ χ² = 1.90, df = 1, Yates’ correction.
* p > .10.
** p < .05, one-tailed test.

For Hypotheses 1-4, the one-tailed test of sig- 
nificance was used in analyses for the group as a 
whole and for high IQ subjects. However, the way in 
which low IQ would affect the results was not pre- 
dicted and a two-tailed test of significance was used. 


RESULTS 


According to Hypothesis 1, the probability 
that individuals who are high in resultant 
motivation make typical choices is greater 
than the probability that individuals who are 
low in resultant motivation make typical 
choices. The frequency of occurrence of the
typical and atypical choices or bias among 
subjects differing in resultant motivation ap- 
pears in Table 1. This table includes the re- 
sults for the group as a whole and for subjects 
above and below the group median in in- 
telligence. A statistically significant difference 
is found in the frequency of the typical bias 
between subjects high and low in resultant 
motivation (χ² = 11.05, df = 1, p < .05).
Thus, Hypothesis 1 is clearly supported.
When the results for individuals of high intelligence
are examined, it is seen that those who
are also high in resultant motivation choose
strongly in the predicted direction (χ² = 7.90,
df = 1, p < .05). The results for individuals of
low intelligence are in the predicted direction
but fail to meet the required level of
significance.
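
As a check on the arithmetic, the chi-square statistic for a 2 × 2 frequency table can be computed directly. The sketch below uses the standard textbook formula, with Yates' correction as an option; the function name is hypothetical. The high-motivation typical cell is taken as 33, the value implied by the row total of 55 and the low-motivation entry of 22. With these frequencies the uncorrected statistic for the whole group comes to about 11.1, close to the reported 11.05, and the corrected statistic for the high IQ subjects comes to about 7.9.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]].

    Uses n * (|ad - bc|)^2 / (row1 * row2 * col1 * col2); with yates=True,
    n / 2 is subtracted from |ad - bc| before squaring.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / denominator


# Whole group: typical (33, 22) and atypical (6, 22) for high and low motivation.
chi_total = chi_square_2x2(33, 22, 6, 22)

# High IQ subjects: typical (21, 10) and atypical (1, 9), with Yates' correction.
chi_high_iq = chi_square_2x2(21, 10, 1, 9, yates=True)
```

Small differences from the printed values presumably reflect rounding in the original computation.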

Hypotheses 2 and 3 are more demanding 
tests of the biasing model. The earlier test re- 
quired only that the choice would be in a 
certain direction; having established this we 
may now examine the data to see if the bias 
is strong enough to cause individuals differ- 
ing in resultant motivation to choose at one 
end or the other of distributions of jobs vary- 
ing in level of difficulty. It was necessary to 
set criteria for high and low levels of difficulty. 
The four jobs ranked as easiest by students 
in each of the 10 occupational fields were 
termed low level of difficulty; the four jobs 
ranked as hardest by the same students were 
termed high level of difficulty. The job ranked 
fifth among the nine alternatives in each case 
was termed intermediate. 
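
The criteria above amount to a simple rule on the difficulty score of the chosen job (1 = least difficult, 9 = most difficult). A minimal sketch, with a hypothetical function name:

```python
def difficulty_level(score):
    """Classify a chosen job by its difficulty score within a nine-job list.

    Scores 1-4 (the four easiest jobs) are "low", score 5 is "intermediate",
    and scores 6-9 (the four hardest jobs) are "high".
    """
    if not 1 <= score <= 9:
        raise ValueError("difficulty score must lie between 1 and 9")
    if score <= 4:
        return "low"
    if score == 5:
        return "intermediate"
    return "high"


# Classification of every possible score in a list.
levels = [difficulty_level(s) for s in range(1, 10)]
```

Choices classified as intermediate are the ones set aside in the analyses of Hypotheses 2 and 3.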

Hypothesis 2 states that in making choices
within an occupational field perceived as having
a high P_s, the probability that individuals
who are high in resultant motivation choose
difficult jobs is greater than the probability
that individuals low in resultant motivation
choose difficult jobs. The estimated P_s of each
individual, in each of the 10 fields, was examined.
The area in which the estimated P_s
was highest was used for further analysis.
Table 2 shows the frequency with which easy
and difficult jobs are chosen by individuals
differing in resultant motivation when the
choices are made within fields of high P_s. The
results are statistically significant in the predicted
direction and the hypothesis is supported
(χ² = 5.40, df = 1, p < .05). The results
for high and low IQ individuals are in
the predicted direction in both cases but they
fail to reach the required level of significance.


TABLE 2

FREQUENCY OF SELECTION OF DIFFICULT AND EASY
OCCUPATIONS IN FIELDS PERCEIVED AS HAVING
A HIGH PROBABILITY OF SUCCESS

                     Resultant motivation
Group and level        High      Low      Total
of difficulty
Total
  High                  27        17       44
  Low                   11        21       32
  Totalᵃ                38        38       76ᵇ **
High IQ
  High                  17         7       24
  Low                    5         9       14
  Total                 22        16       38*
Low IQ
  High                  10        10       20
  Low                    6        12       18
  Totalᵉ                16        22       38*

Note.—Based on forced choices from prepared lists.
ᵃ χ² = 5.40, df = 1.
ᵇ Individuals who chose occupations defined as “intermediate”
do not appear in Table 2.
ᵉ χ² = .52, df = 1.
* p > .10, Yates’ correction.
** p < .05.


According to Hypothesis 3, in making
choices within occupational fields perceived
as having a low P_s, the probability that individuals
who are high in resultant motivation
will choose easy jobs is greater than the
probability that individuals low in resultant
motivation will choose easy jobs. As before,
the estimated P_s of each individual in each of
the 10 fields was examined. That area in
which the estimated P_s was lowest was used
for further analysis. Results for the group as
a whole and for subjects differing in intelligence
appear in Table 3 which shows the
frequency with which easy and difficult occupations
are chosen when the choices are
made within fields of low P_s. A statistically
significant difference is found in the predicted
direction in the frequency of selection
of easy jobs among subjects high and low in
resultant motivation (χ² = 3.23, df = 1, p <
.05). Hypothesis 3 is therefore supported. The
results for high and low intelligent individuals
are in the predicted direction in both cases
but fail to meet the required level of significance.


TABLE 3

FREQUENCY OF SELECTION OF DIFFICULT AND EASY
OCCUPATIONS IN FIELDS PERCEIVED AS HAVING
A LOW PROBABILITY OF SUCCESS

                     Resultant motivation
Group and level        High      Low      Total
of difficulty
Total
  High                  13        22       35
  Low                   29        22       51
  Totalᵃ                42        44       86ᵇ ᶜ **
High IQ
  High                   8         9       17
  Low                   19        11       30
  Totalᵈ                27        20       47*
Low IQ
  High                   5        13       18
  Low                   10        11       21
  Totalᵉ                15        24       39*

Note.—Based on forced choices from prepared lists.
ᵃ χ² = 3.23, df = 1.
ᵇ Eight subjects chose occupations defined as “intermediate”
level of difficulty. Results for these individuals do not appear
in Table 3.
ᶜ One-tailed test.
ᵈ χ² = .60, df = 1.
ᵉ χ² = .88, df = 1.
* p > .10, Yates’ correction.
** p < .05.

TABLE 4

FREQUENCY OF SELECTION OF DIFFICULT AND EASY
OCCUPATIONS IN FIELDS PERCEIVED AS HAVING
A HIGH PROBABILITY OF SUCCESS

                     Resultant motivation
Group and level        High      Low      Total
of difficulty
Total
  High                  21        10       31
  Low                   10        14       24
  Totalᵃ                31        24       55
High IQ
  High                  15         6       21
  Low                    4         4        8
  Totalᶜ                19        10       29ᵇ *
Low IQ
  High                   6         4       10
  Low                    6        10       16
  Totalᵈ                12        14       26*

Note.—Based on expressions of career choice per questionnaire.
ᵃ χ² = 4.31, df = 1.
ᵇ One-tailed test.
ᶜ χ² = .03, df = 1.


TABLE 5

LEVEL OF DIFFICULTY OF JOB CHOICES WITHIN FIELDS
PERCEIVED AS HAVING A HIGH AND A LOW
PROBABILITY OF SUCCESS

                                  Resultant motivation
Perceived P_s in                High                  Low
vocational fields                IQ                    IQ
and level of difficulty
of choice                 High  Low  Total      High  Low  Total
High
  High                     15    6    21          6    4    10
  Intermediate              2    4     6          4    1     5
  Low                       4    6    10          4   10    14
  Total                    21   16    37         14   15    29
Intermediate
  High                      1    0     1          1    0     1
  Intermediate              2    0     2          1    1     2
  Low                       0    1     1          0    4     4
  Total                     3    1     4          2    5     7
Low
  High                      0    1     1          0    3     3
  Intermediate              0    0     0          0    1     1
  Low                       1    1     2          0    2     2
  Total                     1    2     3          0    6     6

Note.—Based on expression of career choice per questionnaire.


TABLE 6

ESTIMATED PROBABILITY OF SUCCESS AND REALISM OF
CAREER CHOICES BY INDIVIDUALS DIFFERING IN
ACHIEVEMENT-RELATED MOTIVATION

                                  Resultant motivation
                                High                  Low
                                 IQ                    IQ
Realism of choice         High  Low  Total      High  Low  Total
Career choices made        21   16    37         14   15    29
  in areas of high
  P_s (realistic)
Career choices made         4    3     7          2   11    13
  in areas of low
  P_s (unrealistic)
Total                      25   19    44         16   26    42


Hypothesis 4 is a still more stringent test
of the model. The criterion variable shifts from
the laboratory-oriented, forced-choice decisions
from prepared lists to expression of actual
future intention concerning vocational
choice. According to this hypothesis, if a
career choice is made in a field in which an
individual estimates his P_s as very high, the
level of difficulty of choice is positively related
to his resultant motivation. Each job
choice was classified according to the dominant
interest category involved in that line
of work. This was done by locating the occupation
in Table 2 in the Kuder (1956) examiner's
manual, “Percentile Ranks of the
Mean Scores of Men in Various Occupational
Groups.” If the specific occupation was not
listed then the most similar listed occupation
was used. By reading across the table the
percentiles attained in each of the 10 Kuder
categories were located. The dominant interest
area was that category which showed
the highest percentile rank for that occupation.
The individual's estimate of P_s in that
dominant field was ascertained from the Probability
of Success scales. Estimated P_s values of 8,
9, and 10 were classified high and those of 1,
2, and 3 were termed low. The remainder were
termed intermediate. The level of difficulty of
an occupation was found by comparing the
chosen occupation with the list of nine jobs 
in the Level of Difficulty scale for the ap- 
propriate field. The results for the group as 
a whole are analyzed in Table 4 and a sta- 
tistically significant positive relationship is 
found between resultant motivation and dif- 
ficulty of choice (x? = 4.31, df = 1, ? < .05) 
when the choices are made in a field of high 
P,. The chi-square one sample test was used 
to test the significance of the relationship 


(Dixon & Massey, 1957). The degree of re- 
lationship was measured using the phi coef- 
ficient (Hays, 1963). The obtained coefficient 
of .28 deviates significantly from 0 at the .05 
level. Hypothesis 4 was therefore supported. 
High-resultant-motivated individuals who are 
high in intelligence choose as anticipated, as 
do low-resultant-motivated individuals who 
are low in intelligence. However, the results 
fail to meet the required level of significance. 

Specific hypotheses were not formulated 
for choice made in areas of low P, since it 
was considered that the cell frequencies would 
be too low to permit meaningful analyses. 
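
The phi coefficient reported for Hypothesis 4 can be related to the reported chi-square by the usual identity for a 2 x 2 table, phi = sqrt(chi-square / N). The sketch below is an illustration only: it assumes N = 55, taken from the marginal totals of Table 4 (31 difficult plus 24 easy choices), and it uses the chi-square value as reported rather than recomputing it from the cell frequencies. The function name is hypothetical.

```python
import math


def phi_from_chi_square(chi_square: float, n: int) -> float:
    """Phi coefficient for a 2 x 2 table: sqrt(chi-square / N)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if chi_square < 0:
        raise ValueError("chi_square must be non-negative")
    return math.sqrt(chi_square / n)


# Values taken from the text and Table 4 (total group).
chi_square_reported = 4.31
n_choices = 31 + 24  # assumed N: difficult plus easy choices in Table 4

phi = phi_from_chi_square(chi_square_reported, n_choices)
print(f"phi = {phi:.2f}")
```

Under these assumptions the result rounds to the coefficient of .28 given in the text.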


TABLE 7


KUDER INTEREST INVENTORY SCORES AND THE REALISM
OF CAREER CHOICE OF INDIVIDUALS DIFFERING
IN ACHIEVEMENT-RELATED MOTIVATION


                                  Resultant motivation
                               High                   Low
Realism of choice               IQ                     IQ
                        High   Low   Total     High   Low   Total

Career choices congruent
  with high Kuder scores
  (realistic)            18     15     33       10      5     15
Career choices not
  congruent with high
  Kuder scores
  (unrealistic)           4      2      6        4     11     15
Total                    22     17     39       14     16     30


Table 5 contains additional data with which to
examine certain questions which are suggested
by earlier research; for example, Do
high-resultant-motivated individuals tend to
choose within areas of high Ps or high inventoried
interest, and can we expect the reverse
for individuals low in resultant motivation?
Two methods of presenting the relationship
between achievement motivation, IQ, and
realism of choice are shown in Tables 6 and
7. Implications are drawn in the discussion.


DISCUSSION 


The significant contribution of this study 
concerns the effects of self-perceptions of 
competence in vocational fields (interest 
areas) on the subjective probability of suc- 
cess in specific occupations. An individual is 
believed to accept an occupation having a 
certain probability of success for him because 
of his predilection to approach or avoid risky 
situations. In the light of the results it would 
seem illogical to assume that the perceived 
level of difficulty of a job is always analogous 
to an objectively defined level of difficulty 
such as its position on a scale which is heterogeneous
with respect to the interest content of
the jobs contained in it. It has been amply
demonstrated herein that individuals choose
jobs differing more widely in level of difficulty
when the choices are made in a field of high
Ps than when they are made in a field of low
Ps. An examination of this kind of variance in
level of aspiration behavior has not been re- 
ported previously in the literature. 

This study may have contributed to an 
understanding of the dynamics of vocational 
choice behavior. There is a close similarity 
between the difficulty and interest area dimen- 
sions used in this research and the level and 
field dimensions of the Roe (1956) occupa- 
tional classification. One of the implications 
of the Roe system is that the occupational 
psychologist may be able to predict choice 
by measuring an individual’s level of voca- 
tional aspiration thus locating him at a point 
on the level of difficulty dimension. The sub- 
ject would then be located on the field dimen- 
sion through the use of an interest inventory. 
The point of intersection should be, the- 
oretically, the optimal job choice for the 
individual. It is suggested that this two-
independent-factors theory is insufficient to account
for the choices which individuals actually
make. One of the reasons, advanced
here, is that individuals are concerned with 
the degree of risk involved in a situation and 
the test of their competence which they are 
prepared to undergo is related to another set 
of motivational variables. The variables con- 
sidered here were achievement motivation and 
fear of failure, or avoidance motivation. Gen- 
eralizing from the results we may say that 
the counsellor should be cautious in applying 
measures of the client’s interest and level of 
aspiration unless he knows something of the 
motivational system of the client. 

Previous research (Burnstein et al., 1963) 
suggests that prestige and excellence would be 
valued differently by each of the two motiva- 
tional groups. Although there is some evi- 
dence that high-resultant-motivated individ- 
uals are more concerned with choosing in a 
field of high Ps, prestige (the choice of a difficult
occupation) did not appear to be as
much a concern of the low-resultant-motivated
individuals as anticipated. The results do
show that some individuals choose jobs in an
area in which they perceive their Ps as low
and this phenomenon was explored further. 

Mahone (1960) and Atkinson and O’Con- 
nor? suggest that realism of choice is posi- 
tively related to intelligence and to achieve- 
ment motivation. If we consider the level of 
perceived Ps in the chosen field of work another
index of realism, we might anticipate the
same findings as Mahone and others. A career
choice made in an area of high perceived Ps
(Ps > 80%) was termed realistic and a choice
made in an area of low perceived Ps (Ps <
80%) was termed unrealistic. Table 6 illustrates
the fact that 84% of the sample of individuals
high in intelligence made choices in
a field of high Ps and that only 70% of individuals
low in intelligence made choices in
a field of high Ps. Those high in resultant
motivation tended to make realistic choices 
irrespective of their intelligence whereas those 
low in resultant motivation and low in intel- 
ligence made a relatively greater number of 
unrealistic choices (42%). 

There are also data available to make a
more direct comparison between these results
and the Mahone and others’ findings. The job
choices of all subjects were already classified
according to the dominant interest field. The
Kuder Preference Record scores of the majority
of subjects were also available. Where
the Kuder score was above the 75th percentile
the choice was termed realistic and if below
this level it was termed unrealistic. The
realism of career choices of individuals differing
in achievement-related motivation is illustrated
in Table 7. The findings are in close
agreement with the analysis based on estimated
Ps. A high proportion (70%) of the
low-intelligent, low-resultant-motivated individuals
make unrealistic choices.


BEYOND PARKINSON'S LAW:


THE EFFECT OF EXCESS TIME ON SUBSEQUENT PERFORMANCE ¹


ELLIOT ARONSON ² AND EUGENE GERARD


University of Minnesota 


A laboratory experiment was conducted in which, “by accident,” some Ss were
allowed too much time in which to perform a task while others were allowed 
a minimum amount of time. Subsequently, when presented with a similar 
task, and allowed to work at their own pace, Ss who were allowed excess 
time initially required more time to complete the task. Thus, going beyond 
Parkinson’s law, not only does a piece of work expand to fill the time 
available, but once it has expanded it continues to require more time. The 
phenomenon is discussed in terms of Guthrie’s theory of learning and 
Festinger’s theory of cognitive dissonance.


Several years ago, C. Northcote Parkinson 
(1957) stated his first law: “Work expands to 
fill the time available [p. 2].” Although Parkin- 
son’s intent may have been partially whimsical, 
his law seems to coincide with casual observa- 
tion of a great deal of human behavior. Espe- 
cially credible is his classic example of an elderly 
lady spending an entire day composing and dispatching
a postcard. What are the processes underlying
this law? One possibility is that people
do not enjoy being completely idle; thus, a per- 
son with little to do and a great deal of time in 
which to do it may redefine the nature of the 
task so that it becomes more complex and, thus, 
more time consuming. This redefinition of task 
requirements is especially likely if the criteria 
for a “good job” are vague and ill defined. For 
example, if a teaching assistant must deliver a 
guest lecture and has only a few hours in which 
to prepare it, chances are he will be able to per- 
form the task creditably. However, if he has 2 
weeks with little else to do, he may spend his 
time polishing his phrases, rearranging his sen- 
tences, pacing, daydreaming, triple checking 
references, shuffling papers, scratching his head, 
sharpening pencils, etc. Since much of this ac- 
tivity is irrelevant to a good lecture, it is doubt- 
ful whether the finished product will be much 
more meritorious in the second case than in the 
first. 

An intriguing question arises concerning what
will happen the next time a similar task presents
itself. It is conceivable that, because a person
has had excess time in which to perform the task
initially, he may come to define any similar task
as one that requires a similar amount of time for
“adequate” preparation. Thus, excess time on the
first occasion may result in the felt necessity of
spending a great deal of time on the task the
next time it presents itself.


¹ This experiment was supported by Grant NSF
GS 202 from the National Science Foundation to
Elliot Aronson. We would like to thank our secretary,
Judith Hilton, who served as the accomplice
in the experiment.

² Now at the University of Texas.

In a loose molar sense this prediction bears 
some resemblance to possible derivations from 
Guthrie’s (1935) theory of learning, in which he 
states: “A combination of stimuli which has ac- 
companied a movement will on its recurrence 
tend to be followed by that movement [p. 26].” 
Although Guthrie’s theory was meant to be taken 
on a more molecular level than our prediction, the 
derivation seems reasonably clear. Thus, in preparing
his second lecture, the teaching assistant
may find himself performing “movements” similar
to those he performed the first time; that is,
he may spend a significant amount of time in 
phrase polishing, pacing, and sharpening pencils— 
even if his time is now more limited than before. 

Our hypothesis is that individuals who are 
allowed excess time to complete a task on the 
first occasion will spend more time completing 
a subsequent similar task than individuals who 
have been allowed a minimum amount of time 
on the first occasion. 


METHOD 


To test this hypothesis, it was necessary to have 
individuals perform some tasks; some would be 
allowed minimum time—others excess time. More- 
over, it was essential that the time allowed be seen 
as arbitrary and accidental rather than as a reflection
of the experimenter’s opinion of the length of
time essential for adequate task performance. The
subjects could then be assigned a similar task and 
they themselves could be allowed to determine how 
long to work at it. The dependent variable would be 
the amount of time spent on the second task. 


Subjects and Procedure 


The subjects were 32 ³ undergraduates (16 males
and 16 females) who made themselves available for
psychological research in order to gain extra credit
in their introductory psychology class. Eight males
and 8 females were randomly assigned to each of
two experimental conditions.

The experimenter seated the subject in a room 
which contained a large table, a tape recorder, and 
a timer. He told the subject that he was in the 
process of assembling materials for a future experi- 
ment involving communication and persuasion. He 
explained that the subject would not be involved in 
an experiment as such; he apologized for this but 
assured each subject that he would receive experi- 
mental credit merely for helping him prepare his 
materials. The experimenter said that he needed a 
large number of tape-recorded speeches on a variety 
of topics. He informed the subject that his job
would be to prepare a 2-minute talk (arguing 
against the prohibition of cigarette ads) and record 
it so that it could be used as a stimulus in future 
experiments. He told the subject that, for purposes 
of the experiments, the talks must appear spon- 
taneous, but to make it easier for him, he would be 
provided with a list of possible arguments from 
which he could fashion his own communication. The
experimenter then gave the subject a list of argu- 
ments and asked him to examine them and prepare 
his speech. 

While the subject was looking over the list and
the experimenter was adjusting the tape recorder,
the independent variable was manipulated. This was
accomplished by the departmental secretary who
came barging into the experimental room on cue,
and said to the experimenter: “Dr. Johnson has the
apparatus set up and is ready to go. Could you
come down and help him now?” The experimenter
protested, “Gee, I was just working with somebody
here.” The secretary pleaded, “But it’s all set to go,
and you know how he is. It will only take ___
minutes.” In the excess-time condition she announced that
it would take 15 minutes; in the minimum-time
condition she announced that it would take 5 minutes.
The experimenter gave in resignedly, “Only 15
[5] minutes, eh? Well, OK, tell him I’ll be right
down.” The experimenter apologized to the subject
(who, of course, had overheard the conversation)
and told him that he would be back as soon as he
could. “Why don’t you work on preparing your
speech; you can record it sometime after I return.”
The experimenter then left the room.


³ Actually, 35 subjects appeared for the experiment.
Three were terminated before any data were
collected. One of these was excused because he misunderstood
the instructions. One was terminated because
she expressed great anxiety about the possibility
that the experiment might keep her from attending
a political speech to begin during that hour.
A third was excused during the experiment because,
having recently been deceived in another experiment,
he verbalized an inordinate amount of skepticism
while the experimenter was presenting the cover
story.


The task was an extremely easy one. The sub- 
jects simply had to choose several statements from 
a list of possible arguments and to arrange them 
in a reasonable sequence. Thus, it was assumed 
that 5 minutes would be adequate time in which 
to perform their task. This assumption proved to 
be correct. After the experiment, when queried, all 
of the subjects in the minimum-time condition felt 
that the time was adequate; none felt rushed or 
pressured.

Care had been taken to keep the room bare of 
any stimulating material. Thus, in the excess-time 
condition, the subject was left in the room for 15 
minutes with nothing to do but assemble available 
material for a 2-minute speech. Furthermore, the 
fact that he had 15 minutes was clearly accidental 
and could not easily be attributable to the experi- 
menter’s judgment of the time requirements of the 
task.

After 15 [5] minutes, the experimenter returned, 
apologized again, and informed the subject that he 
could now put the talk on tape. He set the timer, 
started the tape recorder, and left the room. He
returned in 2 minutes and set up the dependent 
variable by informing the subject that he must 
now prepare a 2-minute speech on a different topic— 
the value of intercollegiate athletics. As in the 
first instance, the experimenter handed the subject a 
list of possible arguments and once again informed 
him that he might use any or all of these in preparing
his speech. The experimenter told the subject
that he should spend as much time as he thought he
needed to prepare a convincing speech.

As a further refinement, half of the subjects in 
each condition were told that they would be free 
to leave (with full credit) as soon as they had 
recorded the second speech. This was done in order 
to determine whether providing an incentive for 
speedy performance would counteract the hypothe- 
sized “excess time effect” and, therefore, lead to a 
quicker performance among those subjects in the 
excess-time condition. 

The experimenter told all subjects to signal him 
(by flipping a switch) as soon as they felt ade- 
quately prepared. He then left the room and acti- 
vated a stopwatch. The dependent variable was the 
time the subjects spent preparing this speech. 

At the close of the experiment the experimenter 
interviewed each subject, probing to find out whether 
any subjects had suspected the true hypothesis of the 
experiment. Although several of the subjects enter- 
tained vague suspicions that the experiment might 
involve more than what they were told, no one 
was able to guess the actual hypothesis. The experi- 
menter then explained the experiment in full and 
discussed the necessity for the deception. 

The subjects in the excess-time (15-minute) condition
were scheduled to begin the experiment precisely
at the beginning of a class period. The subjects
in the minimum-time (5-minute) condition
were scheduled to begin 10 minutes after the beginning
of a class period. This was done in order to
make them equally close (temporally) to the beginning
of the next class hour just before the operation
for the dependent variable was begun. Thus, as the
subjects were about to begin preparing their second
speeches, they were equated on any time pressures
possibly resulting from classes or other appointments
during the next hour.


TABLE 1


MEANS AND STANDARD DEVIATIONS OF SUBSEQUENT PERFORMANCE TIME AS A FUNCTION OF
INITIAL TIME, SEX, AND INCENTIVE


Time           Male    Female   Incentive   No incentive    Total

Minimum
  M           355.1    287.0      391.1        251.0        321.1
  SD          440.5    162.8      447.1        109.9        322.7ᵃ
Excess
  M           453.8    482.1      477.3        458.6        467.9
  SD          299.1    226.3      261.9        269.0        256.6


ᵃ The M and SDs in the minimum-time condition are greatly inflated by the deviant performance of one subject. His time
was 1,425 seconds, which was 8.6 SDs above the mean of the other subjects in this condition. Perhaps a more representative picture
of the actual performance time in this condition can be presented by the M (247.5 seconds) and SD (136.9) without this deviant
case, or by the median for all subjects in the minimum-time condition (215 seconds).
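
The figures given in the footnote to Table 1 for the deviant subject can be checked with simple arithmetic. The following is only a sketch: it assumes 16 subjects in the minimum-time condition (8 males and 8 females, as stated under Subjects and Procedure), so that 15 remain once the deviant case is set aside. The helper names are hypothetical.

```python
def z_score(value: float, mean: float, sd: float) -> float:
    """Distance of a score from a mean, in standard deviation units."""
    return (value - mean) / sd


def mean_with_extra_case(mean_rest: float, n_rest: int, extra: float) -> float:
    """Mean of a group after one further score is added to it."""
    return (mean_rest * n_rest + extra) / (n_rest + 1)


# Values from the Table 1 footnote (seconds).
deviant_time = 1425.0
mean_without = 247.5
sd_without = 136.9
n_without = 15  # assumed: 16 subjects per condition minus the deviant case

z = z_score(deviant_time, mean_without, sd_without)
mean_all = mean_with_extra_case(mean_without, n_without, deviant_time)

print(f"deviant case lies {z:.1f} SDs above the mean of the others")
print(f"mean for all 16 subjects: {mean_all:.1f} seconds")
```

With these inputs the deviant time falls about 8.6 SDs above the mean of the remaining subjects, and the mean for all 16 subjects comes to about 321 seconds, in line with the Total column of Table 1.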


RESULTS AND DISCUSSION 


Our hypothesis was that allowing a person ex- 
cess time in which to prepare a speech will re- 
sult in his requiring more time to prepare a sub- 
sequent speech. The dependent variable was the 
time consumed by each subject in preparing his 
second speech. In the minimum-time condition, 
subjects spent an average of 321 seconds in 
preparation; in the excess-time condition, subjects
spent an average of 468 seconds in preparation.
This difference is clearly in the predicted
direction.

The reader will recall that two additional
variables were matched within each condition:
the sex of the subject and whether or not the
subject was provided with an incentive for quick
performance. The means and sigmas for all
three variables are presented in Table 1. Inspection
of Table 1 reveals that the sex of the subject
made little difference in performance
time across experimental conditions. Likewise,
providing an incentive for speedy performance
made only a slight difference, and that in the opposite
direction; that is, those who were provided with an
incentive to finish quickly spent slightly more
time on the task than those who were not.


TABLE 2


ANALYSIS OF VARIANCE ON THE TRANSFORMED DATA
FOR PERFORMANCE TIME, BLOCKED ON INITIAL
TIME, SEX, AND INCENTIVE


Source            df      MS       F

Time (A)           1    .3852    4.372*
Sex (B)            1    .0237     .269
Incentive (C)      1    .0087     .099
A × B              1    .0037     .042
A × C              1    .0048     .054
B × C              1    .0603     .684
A × B × C          1    .0421     .478
Error             24    .0881

* p < .05.


As is typical with measures of time, a plot of 
the actual scores was positively skewed. The 
indicated logarithmic transformation (Federer, 
1955, p. 47) produced a plot closely approximat- 
ing a normal distribution. An analysis of vari- 
ance was performed on the transformed data; it 
was a 2 × 2 × 2 analysis: Major Treatment
(Time) × Sex × Incentive. The analysis
of variance is presented in Table 2. The Major
Treatment produced a significant effect: subjects
in the excess-time condition spent more time
than those in the minimum-time condition (p <
.05). Neither Sex nor an Incentive for speedy
performance had any effect, either singly or in 
interaction. 
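
Each F ratio in Table 2 is the mean square for a source divided by the error mean square. The sketch below recomputes the ratios from the mean squares that are legible in the table and compares them with the tabled critical value of F for 1 and 24 degrees of freedom at the .05 level (about 4.26). The dictionary keys are labels chosen for this illustration, and the Sex row is omitted here because its mean square is not clearly legible.

```python
MS_ERROR = 0.0881          # error mean square, df = 24
F_CRITICAL_05 = 4.26       # tabled F(1, 24) at the .05 level

# Mean squares from Table 2 (each source has df = 1).
mean_squares = {
    "Time (A)": 0.3852,
    "Incentive (C)": 0.0087,
    "A x B": 0.0037,
    "A x C": 0.0048,
    "B x C": 0.0603,
    "A x B x C": 0.0421,
}

# F = MS(source) / MS(error)
f_ratios = {source: ms / MS_ERROR for source, ms in mean_squares.items()}

for source, f in f_ratios.items():
    flag = "significant at .05" if f > F_CRITICAL_05 else "n.s."
    print(f"{source:<14} F = {f:6.3f}  ({flag})")
```

Only the ratio for Time exceeds the critical value, which agrees with the statement that the Major Treatment alone produced a significant effect.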


The results support the hypothesis: If allowed
excess time to perform a task, subjects subsequently
consume a greater amount of time in
performing a similar task than those allowed a 
minimum amount of time to perform the initial 
task. Moreover, this phenomenon appears to be 
very stable. The subjects who were allowed ex- 
cess time continued to “procrastinate” on a sub- 
sequent task even when provided with some 
incentive for performing quickly. 

This effect appears to be due to the manner 
in which the subject construes the task. If, ini- 
tially, he spends a great deal of time in prepara- 
tion, he apparently defines the task as one that 
requires a great deal of time. Subsequently, when 
a similar task must be done, he tends to treat it 
as a difficult, time-consuming task. The present 
experiment was primarily a demonstration of the
phenomenon and did not shed much light on the
actual mechanisms involved. One possibility may
involve the kind of stereotypic behavior that 
Guthrie and Horton (1946) observed in their 
classic experiment on cats. Provided with excess 
time, our subjects may have performed at a more 
leisurely pace; they may have thought more me- 
ticulously, more cautiously, more slowly. Be- 
cause of this experience, slow and meticulous 
thinking may have become their learned (stereo- 
typic) response to this kind of stimulus. More- 
over, since they had too much time, they had 
the opportunity to perform several irrelevant 
activities in the presence of these stimuli—ac- 
tivities like pacing, head-scratching, etc. These 
irrelevant “movements” may have come to the 
fore in the presence of a similar task. 

A rather different mechanism can be derived
from the theory of cognitive dissonance (Festinger,
1957). The subject’s cognition that he
spent a full 15 minutes on the task may have 
led him to endow it with great importance in 
order to justify the relatively large expenditure 
of time and effort. This attributed importance, in 
turn, could have led him to work hard and long 
the next time a similar task presented itself.
Further experimentation is required to determine
the actual processes involved.

Regardless of the mechanism or processes, the 
phenomenon itself appears to have important 
practical implications. Our results indicate that 
people should be assigned a minimum of time to 
perform a chore. Excess time is not only waste- 
ful in and of itself but leads to the continuous 
waste of time in subsequent performance. Thus, 
to go beyond Parkinson’s law, not only does 
work expand to fill the time available, but once 
it has expanded it continues to require excess 
time, even when time is not readily available.


ROLE OF SUCCESS AND FAILURE IN THE LEARNING OF
EASY AND COMPLEX TASKS ¹


BERNARD WEINER ²


Ss learned an easy or difficult list of paired associates. On the easy task, false
norms implied that Ss were doing poorly relative to others. On the difficult
task, false norms implied that Ss were doing well relative to others. Results
indicated that Ss high in resultant achievement motivation (n Achievement-
Test Anxiety) performed better than Ss low in resultant achievement motivation
on the easy task in which failure was experienced, but worse on the
difficult task in which success was experienced. Similarly, when these Ss were
classified according to level of anxiety, Ss low in anxiety performed better than
Ss high in anxiety on the easy task, but worse on the difficult task. The results
are contradictory to predictions derived from drive theory. A theory based
upon the motivational consequences of success and failure was offered to
account for the data.


In a series of apparently conclusive studies,
Spence and his colleagues (Spence, Farber, &
McFann, 1956; Spence, Taylor, & Ketchel, 1956;
Taylor & Chapman, 1955) demonstrated that
subjects high in anxiety (drive) perform better
than subjects low in anxiety on easy paired-
associates tasks. Conversely, on difficult paired-
associates tasks, highly anxious subjects perform
worse than subjects low in anxiety. These data
have been cited by Spence (1958) to support the
Hull-Spence theoretical position, which conceives
behavior to be a function of Drive × Habit.


4909-6 from the Graduate School of the University
of Minnesota. The author wishes to thank Sherrie
Birch and James J. Jenkins for their suggestions.


In these verbal learning studies the preexperi- 
mental associations between the stimulus-response 
members in the list and the degree of intralist 
response competition determined the difficulty of 
task. In the easy list, the responses to be associ- 
ated with the stimuli were high in the subject's 
response hierarchy when he started the task, and 
intralist competition was minimal. In the difficult 
list, the responses to be associated with the stim- 
uli initially were relatively low in the subject's 
response hierarchy, while response associates high 
in his hierarchy were included as members of 
other S-R pairs. Further, intralist response com- 
petition was maximized by including synonymous 
stimulus words in the list. The generalization be- 
tween these stimuli gave rise to associations be- 
tween nonpaired S-R units. 

Spence (1958) reasons that during the learning 
of the easy list the probability of the correct 
response being elicited by a stimulus is very high, 
while the probability of a stimulus eliciting an 
incorrect response is low. The multiplicative rela- 
tion between drive and habit therefore specifies 
that heightened drive will facilitate performance. 
On the other hand, on the difficult task many 
incorrect responses are associated with the stim- 
uli. Increasing the level of drive is expected to 
increase the amount of response competition, and 
cause decrements in the level of performance. 
Spence employs this analysis to argue that the 
data from the paired-associates studies confirm 
predictions derived from the Drive × Habit
conception. 
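
The multiplicative argument summarized above can be illustrated numerically. The following is a deliberately simplified sketch, not Spence's formal model: it assumes that the strength of a response tendency is the product Drive × Habit and that performance depends on the margin by which the correct tendency exceeds the strongest competing one. The habit and drive values are hypothetical and chosen only to show the direction of the effect.

```python
def excitatory_potential(drive: float, habit: float) -> float:
    """Strength of a response tendency under the Drive x Habit assumption."""
    return drive * habit


def correct_response_margin(drive: float, habit_correct: float,
                            habit_competing: float) -> float:
    """Advantage of the correct response over its strongest competitor."""
    return (excitatory_potential(drive, habit_correct)
            - excitatory_potential(drive, habit_competing))


# Hypothetical habit strengths (illustrative values only).
EASY = {"habit_correct": 0.8, "habit_competing": 0.2}       # correct response dominant
DIFFICULT = {"habit_correct": 0.3, "habit_competing": 0.6}  # competitor dominant

LOW_DRIVE, HIGH_DRIVE = 1.0, 2.0

easy_low = correct_response_margin(LOW_DRIVE, **EASY)
easy_high = correct_response_margin(HIGH_DRIVE, **EASY)
difficult_low = correct_response_margin(LOW_DRIVE, **DIFFICULT)
difficult_high = correct_response_margin(HIGH_DRIVE, **DIFFICULT)

print(f"easy list:      margin {easy_low:+.2f} (low drive) -> {easy_high:+.2f} (high drive)")
print(f"difficult list: margin {difficult_low:+.2f} (low drive) -> {difficult_high:+.2f} (high drive)")
```

Because drive multiplies every habit, raising it widens whatever margin already exists: the correct response gains on the easy list, while the competing response gains on the difficult list.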

The theoretical position of Spence et al. ig- 
nores the cognitive and affective processes which 
occur as the subject attempts to learn these easy 
and difficult lists. It is likely that the rapid learning
of the easy list indicates to the subject
that he is succeeding. That is, on the basis of the
frequent number of correct responses he may
evaluate his performance positively. On the difficult
list the subject may perceive his performance
as a failure. That is, because of the many incorrect
responses he may evaluate his performance
negatively. Previous studies (Child & Whiting,
1950; Katchmar, Ross, & Andrews, 1958; Lucas,
1952; Mandler & Sarason, 1952; Sarason, 1957;
Weiner, 1965a) have shown that highly anxious
subjects exhibit relative increments in level of
performance following success. Conversely, failure
depresses the subsequent performance of
these subjects. For subjects low in anxiety, failure
has been demonstrated to enhance subsequent
performance, while success produces later performance
decrements. Thus the findings of Spence
et al. can be included among the research which
has demonstrated interactions between individual 
differences in level of anxiety and the effects of 
success and failure on subsequent performance. 

This experiment was undertaken to determine 
whether the findings from the paired-associates 
studies cited above are really to be attributed to 
the differential reactions which high- and low- 
anxious individuals exhibit to success and failure 
experiences. To ascertain this, the inherent relation
between the easy task-success experience
and difficult task-failure experience was experimentally
severed. Subjects learning an easy list
of paired associates were told that they were
performing poorly relative to others. In this manner
the easy task was paired with a failure
experience. Subjects learning a difficult paired-
associates task were told that they were doing
well relative to others, thus pairing the difficult 
task with a success experience. If the differen- 
tial reactions to success and failure experiences 
are the essential determinants of behavior in this 
situation, then on the easy task highly anxious 
subjects experiencing failure should perform 
worse than subjects low in anxiety experiencing 
failure. On the difficult task, highly anxious sub- 
jects experiencing success should perform better 
than subjects low in anxiety experiencing suc- 
cess. The experimental design consequently pro- 
vides a definitive test of the alternative explana- 
tion of the Spence et al. data. 


METHOD 


One hundred and ninety-five male students en- 
rolled in the introductory psychology course at the 
University of Minnesota were given a group admin- 
istration of a Thematic Apperception Test (TAT), 
Picture Series 2, 48, 1, 7, 11, and 24 (Atkinson, 1958), 
and the Mandler-Sarason Test Anxiety Questionnaire 
(TAQ; Mandler & Sarason, 1952). The TAT was 
scored for n Achievement according to the method of 
content analysis described in Atkinson (1958). The 
correlation between the TAT and TAQ was r = -.07.
Following the current procedure (Atkinson, 1964), 
scores on the two measures were converted into 
Z scores and score on the TAQ was subtracted 
from score on the TAT, yielding an index of re- 
sultant (approach-avoidance) achievement motiva- 
tion. Subjects scoring in the upper and lower 25% of 
this combined distribution were selected for the 
experiment. Of the 98 subjects selected to appear for 
the second experimental hour, 62 were tested. 
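
The index construction described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' scoring procedure: the function names, the toy scores, and the use of the population standard deviation for standardizing are all hypothetical choices made for the example.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize raw scores (population SD is an assumption here)."""
    m = mean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]


def resultant_motivation(tat, taq):
    """Index = z(TAT n Achievement) minus z(TAQ), one value per subject."""
    return [zt - zq for zt, zq in zip(z_scores(tat), z_scores(taq))]


def select_extremes(index, fraction=0.25):
    """Return subject positions in the upper and lower `fraction` of the index."""
    n = len(index)
    k = int(n * fraction)
    order = sorted(range(n), key=lambda i: index[i])
    low = sorted(order[:k])
    high = sorted(order[-k:]) if k else []
    return high, low


# Hypothetical scores for eight subjects (not data from the study).
tat_scores = [2, 9, 5, 7, 1, 8, 4, 6]
taq_scores = [30, 12, 22, 15, 35, 10, 25, 18]

index = resultant_motivation(tat_scores, taq_scores)
high_group, low_group = select_extremes(index)
```

With these toy values the subjects who combine high TAT scores with low TAQ scores fall in the high group, and the reverse pattern falls in the low group.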


Procedure 


The experimental procedure followed as closely as 
possible that described in Spence, Farber, and
McFann (1956). Subjects had six trials of practice on a
15-unit paired-associates list of nouns. The
conditions of administration of the practice list were
relaxed (McClelland, Atkinson, Clark, & Lowell,
1953) to minimize subjective success or failure at 
that task. Subjects were told that the purpose of 
the practice trials was to familiarize them with the 
apparatus and procedure. Subjects with 0 or more
than 50 correct anticipations during the practice list 
were eliminated from the data analysis. Only one 
subject had to be eliminated because of his perform- 
ance on the practice list. 

Following the practice trials the experimenter 
said:


Starting on this trial the practice is over, and 
your performance does count. From now on your 
performance will be assessed and compared to the 
performance of other college students. We have 
found that people differ in their ability to learn 
this list, and that students who do well at this 
are able to perform well at a variety of other 
tasks. So your performance will be a good indi- 
cator of your general ability. 

Many students wonder how well they are doing 
on the list. So for your own information I will
stop the machine every few trials and tell you how 
many mistakes (correct answers) you made and 
how you are doing as compared to other college 
students.


Thus the conditions of test administration corre- 
sponded to those described as achievement oriented 
(McClelland et al., 1953). The subjects then
attempted to learn an easy or difficult
paired-associates list of adjectives (Spence, Taylor, & Ketchel,
1956).

Success and failure were manipulated by employ- 
ing false norms. In the easy list-failure condition the 
task was introduced as very easy. Subjects were 
told that, 


The average college student learned this list 
within four trials, so you should have very little 
difficulty mastering it. 


Following every even-numbered trial subjects were 
told how many incorrect responses (x) they actually
made, and that,


Most college students at this point are making 
only ½x mistakes.


This was done as efficiently as possible to minimize 
the time interval between trials. In the difficult
list-success condition the task was introduced as very
difficult. The subjects were told that the average
college student required 30 trials to learn the list.
On every third trial starting with Trial 2 the
subjects received feedback on how many correct
responses (x') they made. Subjects also were told that,


Most college students at this point are making 
only ½x' correct responses.


In cases where odd numbers of correct or incorrect 
responses were made, the experimenter rounded the 
reported norms downward. When the number of
errors increased between trials, the reported norm
was the number given on the previous trial. The
lists were presented at a 2:2 rate on a Hull-type
memory drum. The criterion was two successive
trials without an error.

Fig. 1. Percentage of correct responses as a function of resultant achievement motivation and failure at an easy task or success at a difficult task.
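
The false-norm rule for the easy list-failure condition can be written out as a small sketch. It assumes that the announced norm is half the subject's own error count, which is the reading suggested by the rule that odd counts were rounded downward; the function name and the example error counts are hypothetical and are not taken from the study.

```python
def feedback_norms(error_counts):
    """Return the false norm announced after each feedback trial.

    Assumed rule: the norm is half the subject's own error count,
    rounded down for odd counts. If errors increased relative to the
    previous feedback trial, the previously announced norm is repeated.
    """
    norms = []
    for i, errors in enumerate(error_counts):
        if i > 0 and errors > error_counts[i - 1]:
            # Errors went up: repeat the norm given on the previous trial.
            norms.append(norms[-1])
        else:
            norms.append(errors // 2)
    return norms


# Hypothetical error counts on four successive feedback trials.
announced = feedback_norms([9, 6, 7, 3])
```

Under this rule the subject always hears a norm lower than his own error count, so the feedback signals failure even while his errors are declining.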
RESULTS

In Figure 1 the mean percentage of correct 
anticipations for the four experimental groups are 
plotted for the first 12 trials. Inspection of 
Figure 1 reveals that on the easy list where fail- 
ure was experienced subjects high in resultant 
achievement motivation performed better than 
subjects low on this measure. On the difficult list 
where success was experienced, subjects low in 
resultant achievement motivation performed bet- 
ter than subjects high on this measure. An analy- 
sis of variance of the number of trials required 
to reach criterion indicates that there is a sig- 
nificant interaction, F=8.06, df=1/57, p 
<.01, between motive groups and experimental 
conditions. Further analysis reveals that the 
difference in performance between the groups 
who experience failure at the easy task is
significant, t = 2.57, df = 26, p < .02. Among
subjects who experience success at the difficult task,
the difference in performance between the two
motive groups also approaches statistical
significance, t = 2.02, df = 31, p < .10.


TABLE 1

MEAN NUMBER OF TRIALS TO CRITERION AMONG
SUBJECTS SCORING IN THE UPPER AND LOWER
25 PERCENTILE ON THE TAQ

             Easy list (failure)      Difficult list (success)
Anxiety       N      M      SD         N      M      SD
High          9     9.55   3.78       12    14.83   6.57
Low          12     7.08   1.85       10    20.10   7.35

In the studies conducted by Spence et al. the 
Manifest Anxiety (MA) scale was used to clas- 
sify individuals into motivational groups. Ac- 
cordingly, in this study subjects were classified 
as high or low in anxiety. 

Of the 61 subjects tested 43 were in the upper 
or lower 25% of the total distribution on the
TAQ. Table 1 shows the mean number of trials 
to reach criterion for these four experimental 
groups. The pattern of results is identical to that 
disclosed when resultant achievement motivation 
was employed to classify subjects into motive 
groups. An analysis of variance indicates that 
there is a significant interaction, F = 5.00, df
= 1/39, p < .05, between level of anxiety and
the experimental conditions. Further analysis
reveals that among subjects who experienced
success while learning the difficult list, those
classified as highly anxious performed better, t = 1.79,
df = 20, p < .10, than subjects low in anxiety.
On the easy task in which failure was
experienced, subjects high in anxiety performed worse,
t = 1.90, df = 19, p < .10, than subjects low in
anxiety.


DISCUSSION


These results strongly suggest that it is er- 
roneous to cite prior research in this area as vali- 
dating evidence for drive theory. The important 
determinants of behavior in this situation are the 
cognitive and motivational consequences result- 
ing from success or failure at the task, rather 
than the individual’s drive level interacting with 
the structure of the task per se. 

In defense of drive theory it might be argued 
that in this study the MA scale was not used to 
assess individual differences in level of drive. 
However, the MA scale and TAQ are significantly 
related; the reported correlations range from .32 
(Alpert & Haber, 1960) to .53 (Raphelson, 
1957). Further, the drive mechanism has been 
hypothesized to be an internal emotional response 
(Spence, 1958), and the TAQ is a better pre- 
dictor of changes in skin conductance in test situ- 
ations than is the MA scale (Raphelson, 1957). 
Spence (1964) also has recently emphasized the 
necessity of environmental determinants in the 
arousal of anxiety; the TAQ is a measure of the 
tendency to respond with anxiety in test situ- 
ations. It therefore must be concluded that the 
TAQ is an adequate measure of anxiety (drive). 

The results do provide further validity for a 
model proposed by Weiner (1965a) which elabo- 
rates ideas originally formulated by Atkinson 
(1957) and Atkinson and Cartwright (1964). (The 
reader is directed to Atkinson, 1964, for an
expansion and clarification of the conceptions
presented by Atkinson, 1957, and Atkinson and
Cartwright, 1964.) Atkinson (1957) proposed 
that the direction of the tendency to strive for 
achievement-related goals is determined by the 
strength of the disposition to achieve success 
(M_S) relative to the strength of the disposition
to avoid failure (M_AF). When M_S > M_AF, the
resultant achievement-oriented tendency is
positive, and the individual will tend to approach
achievement-related goals. When M_AF > M_S, the
resultant tendency to undertake
achievement-related activities is negative (inhibitory), and the
individual will tend to avoid these activities. That
is, he is "motivated not to perform an act which
might have, as a consequence, failure [Atkinson,
1964, p. 245]." In addition to individual
differences in M_S and M_AF, Atkinson includes
associative and incentive components among the
determinants of achievement-oriented behavior.

The conception presented by Atkinson re- 
cently has been modified by Atkinson and Cart- 
wright (1964). They suggest that previously 
aroused but unsatisfied motivation persists fol- 
lowing nonattainment of a goal. To incorporate 
persisting motivational tendencies into Atkinson's 
earlier theory of achievement motivation, At- 
kinson and Cartwright postulate that the tend- 
ency to undertake an achievement-related task 
is determined by the magnitude of the tenden- 
cies to approach or avoid that task, plus the 
magnitude of the previously aroused and persist- 
ing motivation. 

If the ideas of Atkinson and Cartwright are 
expanded to include the notion that both the 
tendency to approach success and the tendency to 
avoid failure persist following nonattainment of 
a goal (Weiner, 1965a), then the theory specifies 
that following failure differential reactions will 
be exhibited by individuals high (M_S > M_AF)
or low (M_AF > M_S) in resultant achievement
motivation. When M_S > M_AF, the resultant
achievement-oriented tendency is positive; the
resultant persisting motivation following
nonattainment of a goal therefore is positive.
Consequently, these individuals should exhibit
increments in level of performance following failure.
When M_AF > M_S, the resultant
achievement-oriented tendency is negative; the resultant
persisting motivation following nonattainment of a
goal is therefore inhibitory. These individuals
should exhibit decrements in level of
performance following failure. This conception accounts
for the data in this study indicating that
following failure individuals in whom M_AF > M_S
perform worse than individuals in whom M_S > M_AF.

The theory presented by Atkinson and
Cartwright focuses upon the consequences of
nonattainment of a goal, while the effects of goal
attainment on subsequent performance are not
stressed. If it were postulated that success
reduces the magnitude of the persisting
goal-oriented tendencies, then the model could
incorporate data indicating that goal attainment has
substitute value (Henle, 1944; Lissner, 1933; 
Mahler, 1933). Yet this conception is insufficient, 
for success has instigative as well as substitutive 
properties. As a general hypothesis, it is suggested 
that following goal attainment there is a decre- 
ment in the magnitude of both the persisting 
tendency to approach success and the persisting 
tendency to avoid threat of failure. The diminu- 
tion in the strength of these tendencies is propor- 
tionate to the strength of motivation sustaining 
the original activity. When M_S > M_AF, the
decrement in the magnitude of the persisting
approach motivation is greater than the reduction
in the magnitude of the persisting avoidance
motivation. Consequently, these individuals should
exhibit decrements in level of performance
following success. When M_AF > M_S, the decrement
in the strength of the persisting avoidance tend- 
ency is greater than the reduction in the strength 
of the persisting approach tendency. These indi- 
viduals should exhibit increments in level of 
performance following success. This conception 
accounts for the differential reactions which sub- 
jects high or low in resultant achievement moti- 
vation exhibit following success experiences. 
These ideas have been used effectively to ex- 
plain paradoxical findings in the areas of substi- 
tution and psychotherapy (Weiner, 1965b), and 
may prove helpful in reconciling other seemingly 
contradictory data. For example, aggressive fan- 
tasy goal attainment sometimes has cathartic 
value and reduces subsequent aggressive behavior 
(Berkowitz, 1964), and at times leads to
modeling behavior and a subsequent increase in
aggressiveness (Bandura & Walters, 1963). These
opposite findings may be due to differences in the
motivational characteristics of the samples tested. 
Given overtly aggressive children, as might be 
found in the lower class, it is hypothesized that 
fantasy goal attainment will have cathartic value, 
inasmuch as the magnitude of the approach tend- 
ency is greater than the magnitude of the avoid- 
ance tendency. On the other hand, given a sample 
in whom aggressive responses are inhibited, as 
might be found in middle-class children, fantasy 
goal attainment would be expected to result in a 
subsequent increment in the tendency to act
aggressively. In that sample the magnitude of the
avoidance tendency is greater than the magnitude
of the approach tendency. Similarly, if an
aggressive action is thwarted because of fear of
retaliation, lower-class children would be expected to
exhibit subsequent increments in overt aggres- 
sion, while middle-class children should exhibit 
subsequent decrements in aggressive expression. 
That is, there would be an interaction between 
the effects of attainment and nonattainment of 
a goal and the strength of the tendencies to ap- 
proach and avoid the goal object. 


EXPECTATIONS OF SOCIAL ACCEPTANCE AND COMPATIBILITY AS
RELATED TO STATUS DISCREPANCY AND SOCIAL MOTIVES 


ROBERT B. BECHTEL AND HOWARD M. ROSENFELD 1


University of Kansas 


Hypotheses derived from an expectancy-value conception of interpersonal 
choice processes were tested in a sample of college dormitory women. All 159 
Ss were assessed for motives related to affiliation and achievement (TAT n 
Affiliation and n Achievement, self-report rejection anxiety and test anxiety), 
and for social status. After being falsely informed that they were of average 
status, random subsets of Ss either selected new roommates from among 10 
status levels, or estimated their chances of acceptability or compatibility at
each level. As predicted, status discrepancy was negatively related to estimates; 
and Ss typically chose above their own status. Unanticipated status by motive 
interactions, and differences between acceptance and compatibility estimates, 
were interpreted in terms of approach and avoidance mechanisms. 


Expectancy-value theories have been employed 
widely and successfully in research on achieve- 
ment-related behavior (Atkinson, 1958; McClel- 
land, Atkinson, Clark, & Lowell, 1953), but little 
evidence has been accumulated by which to 
evaluate their usefulness in the study of
affiliative behavior. In a laboratory experiment on the
selection of task partners among male high-school 
seniors, Rosenfeld (1964) tested several deriva- 
tions from an expectancy-value conception of 
interpersonal choice. His findings revealed that 
persons who differ from the subject in social 
desirability (i.e., value) are perceived by him to
be low in availability (i.e., expectancy) as task
partners, and that the choice of a task partner
tends to be a compromise between social
desirability and availability. To determine whether
these results can be generalized to interpersonal
choice processes as they commonly occur in
society, the present study examines the above
findings under more informal, less task-oriented
conditions. Under these conditions it is assumed
that high social status is an appropriate quality
by which to judge social desirability.

1 The authors are grateful for assistance provided
by the Bureau of Child Research and residence hall
officials at the University of Kansas. Portions of this
report, based upon the master's thesis of the first
author, were read at the 1964 convention of the
Midwestern Psychological Association.

Several additional aspects of the interpersonal 
choice process were investigated in the present
study. In the Rosenfeld study, as well as most
other laboratory experiments on interpersonal
choice, subjects may have been aware that any
newly-formed relationships would be limited to
the relatively short duration of the study. Thus
it is possible that considerations of long-term
compatibility were sacrificed for the attainment
of immediate acceptance. The current study in- 
vestigates the operation of both acceptance and 
compatibility as goals in the selection of rela- 
tively long-term associates in a persistent "nat- 
ural” social system. 

The effects of social desirability and expecta- 
tions of acceptance and compatibility in the selec- 
tion of associates are likely to be influenced by 
individual differences in the motives of subjects. 
In the application of the "Motive × Expectancy
× Incentive" formula to achievement behavior,
expectancy and incentive are viewed as inversely
related properties of the environment, while
motive is considered an independent intrapersonal
force. However, Atkinson (1957) cites evidence
that subjects high in achievement motivation (n
Achievement) and low in fear of failure (FF)
have higher subjective expectations of
succeeding on a task of given difficulty than do those
who are low in n Achievement and high in FF. 
The present study attempts to determine whether 
n Achievement and FF among women might be 
related in similar fashion to expectations of inter- 
personal success. 

It is also possible that analogous relationships 
may obtain between affiliative motivation and 
the subjective expectation of social success. Ros- 
enfeld (1964) found that need for affiliation (n 
Affiliation) among men was directly related to 
preferences for competent task partners. Perhaps 
n Affiliation, labeled an approach tendency by 
Atkinson, Heyns, and Veroff (1954), affects in- 
terpersonal choice through heightened expecta- 
tions of interpersonal success. Fear of rejection 
(FR), the negative counterpart of n Affiliation, 
should be negatively related to such expectations. 

On the basis of the above discussion and the 
findings of Rosenfeld’s (1964) study several 
hypotheses were derived and tested. Rosenfeld 
found that to the degree potential task partners 
deviated from a subject in competence, they were 
perceived by the subject to be less willing to 
associate with him as a task partner. A
comparable prediction is made in Hypothesis 1: Persons
estimate that their likelihood of being (a)
accepted by and (b) compatible with others
decreases as the social status of others differs from
their own.

Another assumption in Rosenfeld's study was 
that persons are generally motivated to associate 
with others who surpass them in socially desirable 
qualities, even though this tendency was thought 
to be partially inhibited because of the lower 
subjective availability of superior partners. The 
finding that subjects typically attempted to form 
partnerships with peers who were somewhat
superior in competence forms the basis for
Hypothesis 2: Persons in general attempt to
associate with others who surpass them in social status.

The remaining hypotheses are derived from the 
assumption that expectations of social success 
are positively related to individual differences in 
approach motivation and negatively related to 
individual differences in avoidance motivation. 
Both achievement- and affiliation-related motives 
are considered. Hypothesis 3 states that
expectations of acceptance and compatibility are
positively related to (a) n Affiliation and (b) n
Achievement. Hypothesis 4 states that
expectations of acceptance and compatibility are
negatively related to (a) FR and (b) FF.


METHOD 


Subjects were 159 freshman women residents of 
a university dormitory. Female subjects were used 
because women have exhibited higher scores on ques- 
tionnaires of values related to social status (Hyman, 
1953) and because they were expected to show 
greater concern over interpersonal choice than men.
To eliminate volunteer bias, four sections of the 
dormitory were selected at random. 

In the first of two sessions held in the dormitory 
cafeteria, all subjects were administered tests of the 
four motives listed in Hypotheses 3 and 4. TAT
measures of n Affiliation and n Achievement were
obtained by a male experimenter under standard
procedures described in Atkinson (1958, Appendix
III). The six TAT pictures were identical with
those used in a nationwide survey (Veroff, Atkinson, 
Feld, & Gurin, 1960). n Achievement and n Af- 
filiation were scored by separate raters, each of 
whom had previously established high reliability on 
practice stories provided in Atkinson (1958). It 
should be noted that attempts to arouse TAT n 
Achievement in females have not generally been
successful (Lesser, Krawitz, & Packard, 1963;
McClelland et al., 1953), so that the propriety of the
measure in the present study is open to question. 
However, the present set of TAT pictures has been 
used successfully to assess aroused n Affiliation in 
freshman university women (Rosenfeld & Franklin, 
1966), indicating its validity for use as the measure 
of n Affiliation in this study. A short, reliable form
of the Mandler-Sarason Test Anxiety Questionnaire 
(Mandler & Sarason, 1952) was used to assess FF, 
and a similarly constructed scale developed by Rosen- 
feld (1964) was used to assess FR. It should also be 
noted that the FR test had not previously been 
applied to female subjects. 

A face valid Campus Social Status Test, de- 
veloped in cooperation with dormitory administrators 
and counselors, was also administered at the first 
session to make credible the assignment of social 
status levels in the second session. The 16-item 
test included inquiries into grade average; car owner- 
ship and model; educational attainment, occupation, 
and income of parents; social group membership 
and offices; kinds of social activities attended; and 
number of dates per week. Two days later the second 
session was held and each subject was assigned Level
5 on the 10-level social status scale, ostensibly on 
the basis of her answers to the Campus Social Status 
Test. Instructions, publicly endorsed by legitimate 
authorities, informed subjects that this was a 
study concerned with devising better methods of 
roommate assignment and that they would be re- 
quired to select new roommates with whom they 
would actually live for the 6 weeks remaining in 
the semester. 

A random group of 100 subjects was drawn from
the sample and each subject was instructed to select 
a status level from which she would be assigned 
a new roommate. This group will be referred to as 
Group I. Group I made its selection of status levels 
from the 1-10 scale, 1 indicating the highest level 
of social status, and 10 indicating the lowest. Group 
II, a random group of 27 subjects, made estimates
of the likelihood they would be acceptable as
roommates to persons at each of the 10 status levels.
They were informed, prior to making estimates, that 
they would subsequently have a brief meeting with 
a person from each one of these levels, after which 
the accuracy of their estimates would be assessed. 

A third random group of 32 subjects, Group III, 
estimated the likelihood that they would be com- 
patible with roommates at each of the 10 social 
status levels. They were informed, “By using IBM 
machine analysis we were able to determine the 
degree of compatibility between you and each of 
10 persons from another dormitory. The persons 
were selected from each of the 10 social status levels. 
We want to see how accurately you can guess the 
likelihood you will be compatible with persons from 
each level.” Groups II and III made their estimates 
in terms of a fraction showing “chances in a hun- 
dred.” Both groups were told their estimates would 
not be used by the experimenter to influence the 
roommate they later chose. 

After each group made its respective choices or 
estimates, a postexperimental questionnaire was ad- 
ministered, primarily to assess the degree of prefer- 
ence by each subject for roommates from each of 
the 10 status levels. Then the experimenter revealed 
to subjects that no roommate changes would actually 
take place and carefully explained the real purpose
of the experiment.2


RESULTS 


Figure 1 shows subjects’ mean expectations of 
acceptance from and compatibility with room- 
mates from each status level. As predicted in 
Hypothesis 1, subjects expected less acceptance 
and less compatibility to the degree that the 
status level of a potential roommate differed 
from their own. F tests for acceptance expecta- 
tions across status levels and for compatibility 
expectations across status levels were significant 
(p<.01 and p<.001, respectively). Subjects 
expected that persons at status levels either 
above or below their own level would be less 
accepting and less compatible (p < .01 by t test
for each comparison).3

A comparison of the two distributions shown 
in Figure 1 also showed a significant interaction 
(p < .01) between type of expectation
(acceptance versus compatibility) and social status
level. High status persons were seen to be sig- 
nificantly more likely to be compatible than ac- 
cepting (p <.05 at Status Levels 2, 3, and 4), 
while low status persons were seen as more likely 
to be accepting than compatible (p<.05 at 
Status Levels 7, 8, and 9). 

In confirmation of Hypothesis 2, the 100 sub- 
jects in Group I selected roommates whose 
mean status level (4.01) was significantly higher 
than their own (p <.01). 

In the postexperimental questionnaire all sub- 
jects stated on a 10-point scale their preference 
for potential roommates at each of the 10 social 
status levels. There were no significant differ- 
ences between the preference scores of all three 
experimental groups, so they were combined. The 
total distribution curve for preference scores was
almost exactly parallel to the compatibility
expectation curve in Figure 1. Further analyses of
the expectation and preference curves revealed
that preferences stated on the postexperimental
questionnaire were correlated to a significantly
greater degree with compatibility expectations
than with acceptance expectations (p < .01).
Hypotheses 3 and 4 were tested by analysis of 
variance using the particular motivational test 
(either n Affiliation, n Achievement, FF, or FR) 
and status level as independent variables. None
of the measures of motivation was significantly
related to expectations of acceptance, either as a
main effect or in interaction with status levels
of subjects. Two motivational tests, n Affiliation
and FF, were related to compatibility
expectations. Compatibility expectations of subjects high
in n Affiliation were significantly higher (p <
.01) than those of subjects who scored low in
the n Affiliation test (see Figure 2). Thus,
Hypothesis 3a was confirmed by one criterion,
compatibility. However, the significant differences
in Figure 2 were confined to the lower part of
the status dimension (p < .05 at Levels 5 and 8;
p < .01 at Levels 6 and 7).

2 Responses to the postexperimental questionnaire
and to the revelation of the purposes of the
experiment strongly indicated that, with the exception of
the claimed ability to match persons for
compatibility by IBM analysis, the experimental inductions
were credible to the subjects.

3 All p levels in this report are for two-tailed tests.

Contrary to Hypothesis 4b, subjects high in 
FF stated higher (p < .10) expectations of com- 
patibility than did subjects low in FF (see Fig- 
ure 3). Similar to the n Affiliation results, a sig- 
nificant interaction between motive and status 
was obtained (p <.01), with reliable differences 
limited to the lower end of the status scale (p< 
.05 at Status Levels 7 and 8 in Figure 3). It 
should be added that none of the intercorrela- 
tions among the four measures of motivation was 
significant and all were numerically close to zero. 


DISCUSSION 


The curvilinear relationships between status 
discrepancy and expectations of acceptance and 
compatibility are comparable to the results of 
Rosenfeld's (1964) study in which males antici- 
pated that persons more similar to themselves in 
task competence would be more available to 
them as task partners. A second similarity is that 
the average subject in both studies chose a person 
approximately one level above himself on the 
relevant dimension of social desirability—higher 
status or competence. This last finding is con- 
sistent with the assumption that persons are 

Fig. 3. Expectation of compatibility as a function of

of social attainment are seen as available to
them. 

The differences between expectations of ac- 
ceptance and compatibility were not anticipated. 
An interpretation of the differences between the 
acceptance and compatibility curves can be de- 
rived from Miller's (1944) theory of approach 
and avoidance gradients. Slightly over one half of 
the subjects stated in the postexperimental ques- 
tionnaire that they were dubious about the ex- 
perimenter’s claim that he could determine com- 
patibility by IBM analyses of personality test 
scores. Thus, while subjects anticipated that 
their expectations of acceptance would be shortly 
verified, it is likely that they felt compatibility 
could not be determined until much later. If en- 
counters involving a question of acceptance were 
likely to be closer than those involving compati- 
bility, the avoidance (of interpersonal failure) 
gradient should have been higher when making 
estimates of acceptance than of compatibility. In 
other words, realistic anticipations of failure may 
have been more salient to judgments of accept- 
ance, while fantasies of success were more likely 
to affect judgments of compatibility. 

Since it was also found that expectations of 
compatibility were more highly correlated with 
preference scores than were expectations of ac- 
ceptance, the use of compatibility expectations 
in the prediction of long-term interpersonal 
choices would seem more potent than acceptance 
expectations. Future experiments in interpersonal 
choice cannot ignore the significance of the 
differences between these two kinds of expecta- 
tions. 

As predicted, subjects high in n Affiliation ex- 
pected significantly higher compatibility than did 
subjects low in n Affiliation; but this difference 
was significant only with respect to low status per- 
sons. If social approval is gained by associating 
with high status persons and n Affiliation is a 
motive to obtain social approval, one would ex- 
pect subjects high in n Affiliation to have higher 
expectations of compatibility from high status 
persons than would subjects low in n Affiliation. 
An additional unanticipated finding was that sub- 
jects high in n Affiliation indicated on the post- 
experimental questionnaire that they were sig- 
nificantly less attracted to low status persons 
than were subjects low in n Affiliation (p < .01 
at Status Levels 6-10, respectively). 

The paradoxical behavior of subjects high in n 
Affiliation may be explained by considering pre- 
vious research which indicates that such subjects 
seek approval, but that they tend not to receive 
it. While Atkinson et al. (1954) presented evi- 
dence that n Affiliation is a tendency to engage 
in approach behavior, they also found that sub- 
jects higher in n Affiliation were rated lower in 
sociometric status by peers. Similarly, Groesbeck 
(1958) found that subjects higher in n Affilia- 
tion were rated lower as intimate friends by their 
peers. Rosenfeld (1964) found that subjects 
higher in n Affiliation attempted more to estab- 
lish relationships with peers who surpassed them 
in competence, but withdrew at the first sign of 
nonacceptance. If it is assumed that subjects high 
in n Affiliation gain some awareness of their 
relative undesirability yet wish to overcome it, 
the results of the present study can be inter- 
preted. 

Subjects higher in n Affiliation expect more 
compatibility with lower status persons because 
they perceive that the low status persons are 
similar to themselves in undesirability. Yet, they 
wish to overcome their undesirable positions. 
Hence, to appear available and desirable to 
higher status persons, they try to avoid association 
with lower status. In short, the high n Affiliation 
subjects realistically estimated their chances 
of compatibility with lower status persons as 
being greater than did subjects low in n Affilia- 
tion but they attempted more to avoid the stigma 
of lower status associations. It would appear 
very much as though to high n Affiliation sub- 
jects, higher status persons are a positive ref- 
erence group and lower status persons are a 
negative reference group as Newcomb (1950, p. 
226) defines these terms. 

Subjects high in FF also expected greater 
compatibility than did subjects low in FF, but 
again only with lower status persons. According 
to Hypothesis 4b, however, FF should have been 
negatively related to expected compatibility. 
Hypothesis 4b was based on studies of the sub- 
jective difficulty of relatively impersonal tasks 
(Atkinson, 1957). One source of difficulty in 
attaining a compatible interpersonal relationship 
is the subject's inability to make himself attractive 
to others. If the subject high in FF 
perceives that like the low status person he is 
unable to attract high status persons, he may 
expect the low status person to accept him by 
“default.” 

The measures of n Affiliation and FF, although 
usually applied to male subjects, thus seem to be 
indicators of individual differences that are important 
determinants of female social behavior. 
The findings that n Affiliation and FF are related 
to expectations of compatibility but not accept- 
ance may indicate that subjects who score high 
in these variables are more concerned over their 
ability to persist in social relationships than their 
ability to establish a positive first impression. 

The fact that n Achievement was not related 
to expectations in this study might be due either 
to the inapplicability of the present measure to 
female subjects or to the irrelevance of n 
Achievement to the social goals of women. Pos- 
sible reasons for the lack of relationship between 
FR and expectations are not readily apparent. 

The general outcome of the present study in- 
dicates that not all of the principles of expect- 
ancy-value theory that have been developed in 
the area of achievement behavior are directly 
applicable to affiliative behavior. Such compari- 
sons are useful, however, in revealing which 
principles can be generalized across content areas 
and in pointing out discrepancies that require 
reconceptualization and further research. The 
present study indicates that in the determination 
of interpersonal choice two functionally separate 
forms of expectation operate—acceptance and 
compatibility. This bifurcation of expectancy re- 
vealed totally unexpected behavior on the part 
of subjects high in n Affiliation and FF. 


REFERENCES 


ATKINSON, J. W. Motivational determinants of risk-taking 
behavior. Psychological Review, 1957, 64, 
359-372. 

ATKINSON, J. W. (Ed.) Motives in fantasy, action, 
and society. Princeton, N. J.: Van Nostrand, 1958. 

ATKINSON, J. W., HEYNS, R. W., & VEROFF, J. The 
effect of experimental arousal of the affiliation 
motive on thematic apperception. Journal of Ab- 
normal and Social Psychology, 1954, 49, 405-410. 

GROESBECK, B. L. Toward description of personality 
in terms of configurations of motives. In J. W. 
Atkinson (Ed.), Motives in fantasy, action, and 
society. Princeton, N. J.: Van Nostrand, 1958. 
Pp. 383-399. 

HYMAN, H. H. The relation of the reference group 
to judgments of status. In R. Bendix & S. M. 
Lipset (Eds.), Class, status, and power. Glencoe, 
Ill.: Free Press, 1953. Pp. 263-270. 



LESSER, G. S., KRAWITZ, RHODA N., & PACKARD, RITA. 
Experimental arousal of achievement motivation 
in adolescent girls. Journal of Abnormal and Social 
Psychology, 1963, 66, 59-66. 

MANDLER, G., & Sarason, S. B. A study of anxiety 
and learning. Journal of Abnormal and Social Psy- 
chology, 1952, 47, 166-173. 

McCLELLAND, D. C., ATKINSON, J. W., CLARK, R. A., 
& LOWELL, E. L. The achievement motive. New 
York: Appleton-Century-Crofts, 1953. 

MILLER, N. E. Experimental studies of conflict behavior. 
In J. McV. Hunt (Ed.), Personality and 
the behavior disorders. Vol. 1. New York: Ronald 
Press, 1944. Pp. 431-465. 


Journal of Personality and Social Psychology 
1966, Vol. 3, No. 3, 349-353 



NEWCOMB, T. M. Social psychology. New York: 
Holt, Rinehart, & Winston, 1950. 

ROSENFELD, H. M. Social choice conceived as a level 
of aspiration. Journal of Abnormal and Social 
Psychology, 1964, 68, 491-499. 

ROSENFELD, H. M., & FRANKLIN, S. S. Arousal of 
need for affiliation in women. Journal of Personal- 
ity and Social Psychology, 1966, 3, 245-248. 

VEROFF, J., ATKINSON, J. W., FELD, SHEILA C., & 
GURIN, G. The use of thematic apperception to 
assess motivation in a nationwide interview study. 
Psychological Monographs, 1960, 74(12, Whole No. 
499). 

(Received September 28, 1964) 


A TEST OF A MODEL FOR COMMITMENT 


CHARLES A. KIESLER 1 
Yale University 


AND 


JOSEPH SAKUMURA 
Ohio State University 


Commitment is defined as a binding of the individual to behavioral acts, and a 
theoretical model is presented for the role of commitment in attitude change. 
The derivation tested here is: the greater the inducement offered S for performing 
an act consistent with his beliefs, the less committed he is to that act, 
and the less the resistance to subsequent countercommunications. Ss were 
differentially paid for performing an act consistent with their prior beliefs. 
Later all Ss received a strong countercommunication on the same topic. The 
hypothesis was confirmed: Ss receiving the greater payment for performing 
the consonant act later showed greater attitude change in the direction 
advocated by the countercommunication. 


1 The data were collected while the senior author 
was at the Ohio State University. 

2 "Psychological Commitment in Attitude Change," 
Proposal submitted to the National Institute of Mental 
Health, United States Public Health Service, 1964. 


There is a rapidly increasing body of evidence 
indicating that a thorough understanding of the 
role of commitment is a necessary requisite to an 
adequate theory of attitude change. Perhaps 
Brehm and Cohen (1962), in their discussion of 
dissonance theory (Festinger, 1957), have voiced 
the most explicit interest in commitment as a 
variable, but other authors (e.g., Hovland, Janis, 
& Kelley, 1953; Secord & Backman, 1964; Sherif 
& Hovland, 1961) have also been concerned with 
the topic. Although theoretical interest has been 
increasing, there have been few efforts to deal 
directly with the concept experimentally. 

Recently, however, Kiesler 2 proposed a preliminary 
theoretical model for the role of commitment 
in attitude change. He defines commitment 
as a pledging or binding of the individual 
to behavioral acts, and discusses how commitment 
to specific behaviors will relate to attitudes 
and their resistance to change. Some of the basic 
assumptions and hypotheses are presented below. 

1. The individual attempts to resolve incon- 
sistencies between the attitudes he holds and be- 
havioral acts which he, for one reason or an- 
other, is induced to perform. This assumption is 
quite similar, if not identical, to the main assump- 
tion of the “consistency” models proposed by 
Festinger (1957), Heider (1958), and Osgood 
(1960), and hence is supported by evidence re- 
lated to those theories. We assume that to resolve 
the inconsistency, one may change either the at- 
titude or the act (including the psychological im- 
plications of the act). 

2. The effect of commitment is to make an act 
less changeable. Thus, utilizing Assumptions 1 and 
2, we may hypothesize: (a) if the act is consistent 
with the subject’s previous belief system then 
commitment to the act makes the subject more 
resistant to subsequent attack on his beliefs; (b) 
if the act is inconsistent with the subject’s previous 
belief system, then the commitment to the 
act acts as a force on the subject to change his 
attitudes towards consistency with the act. 

3. The magnitude of the effect of commitment 
is positively and monotonically related to the 
degree of commitment. Thus, the greater the 
commitment, the greater the resistance in 2a and 
the greater the attitude change in 2b. 

4. Finally, the assumption is made that the de- 
gree of commitment may be manipulated in sev- 
eral ways. We may hypothesize, for instance, that 
one may increase the degree of commitment by 
increasing one or more of the following: (a) the 
number of acts performed by the subject; (b) 
the importance of the acts for the subject; (c) 
the explicitness of the act, for example, how pub- 
lic or otherwise unambiguous the act was; (d) 
the degree of irrevocability of the act; (e) the 
degree of volition perceived by the subject in 
performing the act. In turn, we hypothesize that 
the degree of volition may be increased by: an 
increase in the degree of perceived choice in per- 
forming the act; a decrease in the degree of ex- 
ternal pressure exerted upon the subject to per- 
form the act. As implied above, perceived choice 
and external pressure are not independent phenomena. 
The former relates to the phenomenological 
world of the subject and the latter is related 
to operations performed by the experimenter. 
In addition, there is some evidence (cf. 
Cohen, 1960) that the two variables are inversely 
related to each other. Thus, increasing the degree 
of external pressure exerted upon the subject to 
perform a particular act should decrease his per- 
ceived choice in performing the act. 

Thus, the model assumes the subject is pledged 
or bound by the performance of an overt act. 
How "committing" a given act is for the sub- 
ject depends upon those variables listed above. 
The model further assumes that the subject at- 
tempts to resolve inconsistencies between be- 
havioral acts and attitudes. To resolve an in- 
consistency, the subject may change either his 
attitude, or distort or deny the meaning of the 
act. The greater the commitment, the less the 
subject's tendency to distort the act, and the 
greater the tendency to change an attitude in- 
consistent with the act. Commitment equal, if a 
given act is consistent with previously held at- 
titudes, the act increases the resistance of the 
subject to subsequent countercommunications. 
The greater the commitment, the greater is the 
resistance to change. If a given act is inconsistent 
with previously held attitudes, the inconsistency 
acts as a force on the subject to change his at- 
titudes to make them more consistent with the 
act. The greater the commitment, the greater the 
force. 

The present experiment proposes to test one 
derivation from this model: namely, the less the 
pressure exerted on the subject to perform an 
act consistent with his beliefs, the greater his re- 
sistance to subsequent countercommunications. 
In the present case, subjects are induced to per- 
form an overt, explicit act consistent with their 
beliefs: half of the subjects are paid $5 for doing 
this; half are paid $1. At a later time, all sub- 
jects are presented with a strong communication 
contrary to their beliefs. Assuming that the de- 
gree of payment is operationally equivalent to the 
degree of pressure to perform the act, the pre- 
diction for the present experiment is clear: The 
less the payment for performing an act consistent 
with one’s beliefs, the greater the resistance to 
subsequent countercommunications. 
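
The chain of reasoning behind this prediction can be laid out as a product of signed links. What follows is only an illustrative sketch: the link names, the +1/-1 coding, and the helper `net_direction` are assumptions introduced here for exposition and are not part of the model as the authors state it.

```python
# Each link is (cause, effect, direction): +1 means the effect rises with the
# cause, -1 means it falls. The wording paraphrases the model's assumptions;
# the numeric coding itself is a hypothetical device, not the authors' notation.
LINKS = [
    ("payment", "external_pressure", +1),   # payment treated as equivalent to pressure
    ("external_pressure", "volition", -1),  # more pressure, less perceived volition
    ("volition", "commitment", +1),         # more volition, more commitment (4e)
    ("commitment", "resistance", +1),       # consonant act: more commitment, more resistance (2a, 3)
]


def net_direction(links, start, end):
    """Multiply the link signs along an ordered chain running from start to end."""
    sign = 1
    current = start
    for cause, effect, direction in links:
        if cause != current:
            raise ValueError(f"chain broken at {cause!r}")
        sign *= direction
        current = effect
        if current == end:
            return sign
    raise ValueError(f"{end!r} not reached")


prediction = net_direction(LINKS, "payment", "resistance")
print(prediction)  # a negative sign: greater payment, less resistance
```

A single negative link (pressure lowering volition) is what turns greater payment into less resistance; every other link in the chain is positive.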


METHOD 


Two experiments were carried out with only 
minor procedural differences between the two. 
In addition, however, two control groups were 
included in the second experiment. The two ex- 
periments will be described in sequence. 


Experiment I 


Subjects. Forty-two male and female students in 
an introductory psychology course at Ohio State 
University volunteered for the experiment, entitled 
“Opinion Survey,” as a regular part of their course 
requirements. Five subjects were suspicious about 
the manipulations and were dropped: four in the $5 
condition and one in the $1 condition. The experi- 
menter met individually with each subject. 

Procedure. On arrival, subjects were told that the 
experimenter was actually a research assistant for 
two different faculty members, Drs. E and M, and 
that the experimenter would be gathering data for 
two different studies during the subject’s hour. (This 
is not unusual at Ohio State and subjects invariably 
accepted the information without comment.) Dr. E's 
material was to come first and the subjects were 
asked to express their opinions on 12 items, each on 
a 7-point scale ranging from strongly agree to 
strongly disagree. The key item for the present ex- 
periment was, "The legal age of voting should be 
lowered to 18 years of age." This item had been 
found in previous work to produce a bimodal dis- 
tribution in the Ohio State population. 

The consonant act. Two communications had previ- 
ously been prepared, each representing a moderate-to- 
extreme position on the key issue. The experimenter 
covertly noticed whether the subject was pro or con 
on the key issue. The experimenter then assigned the 
subject the communication consistent with the sub- 
ject’s position and asked him to read it into a tape 
recorder, identifying himself by name. The subjects 
were told that Dr. E would pay them $1 or $5 for 
doing this, depending on condition. The experimenter 
said: 


Dr. E is not only interested in opinion surveys, but 
also he is interested in the media of mass com- 
munication, or the dynamic interaction between 
speaker and the audience. He's especially interested 
in determining whether a person's geographical 
origin has a differential effect upon a listener. Are 
you from Ohio? [Usually, the subject said, “Yes.”] 
Dr. E wants to know whether your Midwestern 
voice has a differential effect upon a listener or 
listeners. For example, does a Southerner have a 
differential effect upon a listener? I was born in 
Sacramento, California. As a person from the Far 
West and from a different racial group [the ex- 
perimenter was a Japanese-American], Dr. E 
wanted my voice for demonstration purposes in the 
seminars which he conducts throughout the coun- 
try. He asked me to record your voice to be used 
for these demonstrations. Since Dr. E is paid for 
conducting these seminars, he told me to give you 
some remuneration . . . give you some funds... 
pay you some money. I don't think my voice is 
worth it, but he gave one dollar [five dollars] 
when I recorded my voice for him. I think he 
wants to give the money because he'll use the tape 
over and over again. It's like the T.V. commercials. 
You know that a person who tapes a commercial 
is not only paid for the initial taping, but also 
whenever it's used. Would you mind reading this 
communication into the tape recorder? Incidentally, 
don't be dramatic. Use your normal voice. 
If Dr. E wanted a dramatic voice, he could have 
gone to the Speech Department, or hired profes- 
sional actors. He wants ordinary voices like yours 
and mine to see how ordinary voices influence a 
listener. Are there any questions? 


After the subject had recorded the communication, 
the experimenter announced that was the end of Dr. 
E's study. He then attempted to separate the two 
portions of the experiment by providing the subject 
with a short break, signing the subject's credit card 
for the experiment, and making various irrelevant 
remarks before continuing with the experiment. 

Countercommunication. The experimenter an- 
nounced that Dr. M was gathering some preliminary 
data, that the experimenter was not quite sure what 
it was all about, since this was the first time it has 
been done and the experimenter had not had time 
to discuss the project with Dr. M. Consequently, he 
could not answer any questions. The experimenter 
gave the subject the second (counter-) communica- 
tion to read. He then gave the subject a “personal 
inventory” to fill out. The personal inventory had 
been duplicated by a different process than the previous 
opinion survey, and the 13 items were cast in 
varying formats. The key item, of course, was repeated 
in identical form. 

Termination. As a method of ferreting out suspi- 
cion, subjects were asked if they had any ideas they 
would like to contribute on either experiment. The 
experiment was then explained to them; they were 
sworn to secrecy and dismissed. 


Experiment II 


Procedural differences from Experiment I. There 
were only two procedural differences between Experiment I 
and Experiment II: the clause, “I don’t think 
my voice was worth it, but . . .” was deleted from 
the experimenter's monologue; after the subject had 
agreed to read the first communication, he filled out 
a form, indicating his name, address, and the amount 
of money he was to be paid. This form was kept in 
front of the subject during the rest of the experiment. 

Control groups. Note that in both Experiments I 
and II, the sequential arrangement for the subject is: 
take prequestionnaire, read consonant speech into 
tape recorder, read countercommunication, take post- 
questionnaire. In both studies we would like to 
measure the resistance to the countercommunication. 
However, the difference between the pre- and post- 
questionnaires would include not only attitude change 
due to the countercommunication, but also any ef- 
fects which might occur as a result of reading the 
consonant communication (e.g., strengthening of be- 
lief). To check on this, two control conditions were 
included in Experiment II: one was paid $1 for 
reading the consonant communication; one was paid 
$5. After the communication had been read, they 
took the usual break, filled out Dr. M’s personal 
inventory, and were dismissed without ever reading 
the countercommunication. By proper comparisons 
with these control conditions, we may ascertain both 
the effect of performing the consonant act and the 
effect of reading the countercommunication. 


RESULTS 


The results for the attitude change obtained in 
experimental conditions are presented in Tables 
1 and 2. It may be seen from Table 2 that the 
procedural differences between Experiments I and 
II produced no difference in results (Winer, 1962, 
pp. 213-216), and we may confidently pool the 
results of the two experiments. It may be seen 
that in both experiments the predicted effect took 
place: subjects who received less payment for 
performing an act consonant with their beliefs 
showed greater resistance to subsequent 
countercommunications. 


TABLE 1 

MEAN ATTITUDE CHANGE TOWARD A COUNTERCOMMUNICATION 
AS A FUNCTION OF PRIOR DIFFERENTIAL PAYMENT 
FOR PERFORMING A CONSONANT ACT 

                      Payment 
                 $1        $5        Total 
Experiment I     .18       .86       .53 
                 (18)      (19)      (37) 
Experiment II    .09       .65       [illegible] 
                 (14)      (17)      (31) 
Total            .14       .76 
                 (32)      (36) 

Note. Ns in parentheses. 


TABLE 2 

ANALYSIS OF VARIANCE OF ATTITUDE CHANGE 
SCORES FOR EXPERIMENTAL CONDITIONS 

Source             SS        df    MS       F 
Experiments (E)    .370      1     .370     <1.0 
Payment (P)        6.454     1     6.454    5.58* 
E × P              .060      1     .060     <1.0 
Within             73.970    64    1.156 
Total              80.854    67 

*p < .05. 
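
The marginal means of Table 1 and the F ratios of Table 2 can be checked by hand from the printed entries. The sketch below reads the four cell means as .18, .86, .09, and .65 and takes the sums of squares from Table 2; the variable names are hypothetical, and the Experiment II marginal it computes is a derived figure, since that cell is not legible in the table as it survives here.

```python
# Cell means and Ns as read from Table 1 (experimental conditions only).
# Keys are (experiment, payment in dollars); values are (mean change, N).
cells = {
    ("I", 1): (0.18, 18),
    ("I", 5): (0.86, 19),
    ("II", 1): (0.09, 14),
    ("II", 5): (0.65, 17),
}


def weighted_mean(keys):
    """N-weighted mean of the named cells."""
    total_n = sum(cells[k][1] for k in keys)
    return sum(cells[k][0] * cells[k][1] for k in keys) / total_n


exp1_total = weighted_mean([("I", 1), ("I", 5)])    # printed marginal: .53
exp2_total = weighted_mean([("II", 1), ("II", 5)])  # derived, not transcribed
pay1_total = weighted_mean([("I", 1), ("II", 1)])   # printed marginal: .14
pay5_total = weighted_mean([("I", 5), ("II", 5)])   # printed marginal: .76

# Sums of squares and degrees of freedom as read from Table 2.
ss = {"experiments": 0.370, "payment": 6.454, "interaction": 0.060, "within": 73.970}
df = {"experiments": 1, "payment": 1, "interaction": 1, "within": 64}

ms_within = ss["within"] / df["within"]
f_experiments = (ss["experiments"] / df["experiments"]) / ms_within
f_payment = (ss["payment"] / df["payment"]) / ms_within
f_interaction = (ss["interaction"] / df["interaction"]) / ms_within

print(round(exp1_total, 2), round(pay1_total, 2), round(pay5_total, 2))
print(round(exp2_total, 2))
print(round(ms_within, 3), round(f_payment, 2))
print(f_experiments < 1, f_interaction < 1)
```

The component sums of squares add to the printed total of 80.854, and only the payment effect yields an F ratio above 1.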

The results for the two control conditions are 
presented in Table 3. The two means are not 
statistically different from each other (t < 1.0), 
nor is either of them significantly different from 
zero (t < 1.0 in both cases). We may now compare 
the two experimental conditions against the 
control conditions. We find that the $5 experimental 
condition produced greater change than 
the $5 control condition (t = 2.47, df = 51, 
p < .02), but that the $1 experimental condition did 
not produce significantly greater change than the 
$1 control condition (t = .66). Thus we may conclude 
that the main hypothesis for the present 
experiment is supported: subjects who received 
less payment for performing an act consonant 
with their beliefs showed greater resistance to 
subsequent countercommunications. In addition, 
we may not conclude that the payment itself and 
the consonant act had any differential or absolute 
effect on the subjects’ attitude. 
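
The reported df = 51 is what a pooled-variance t test between two independent groups would give for the pooled $5 experimental group (N = 36) against the $5 control group (N = 17). That the authors used this form of the test is an assumption; the text gives only the t and df values. The sketch below also checks the Table 3 total, reading the $5 control mean as -.145, which is an assumed reading of a damaged entry chosen because it agrees with the printed total of -.08.

```python
# Group sizes as printed in Tables 1 and 3.
N = {
    "experimental_1": 32,
    "experimental_5": 36,
    "control_1": 20,
    "control_5": 17,
}


def pooled_df(n1, n2):
    """Degrees of freedom for a two-sample t test with pooled variance."""
    return n1 + n2 - 2


df_5 = pooled_df(N["experimental_5"], N["control_5"])
df_1 = pooled_df(N["experimental_1"], N["control_1"])

# Control means: -.025 as printed; -.145 is an assumed reading (see lead-in).
control_means = {"control_1": -0.025, "control_5": -0.145}
control_total = sum(control_means[g] * N[g] for g in control_means) / (
    N["control_1"] + N["control_5"]
)

print(df_5, df_1)
print(round(control_total, 2))
```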


Discussion 


The main hypothesis for the present experi- 
ment was supported: subjects who received less 
payment for performing an act consonant with 
their beliefs showed greater resistance to sub- 
sequent countercommunication than subjects 
whose payment was greater. Aside from the sup- 
port for the model presented here, what are the 
implications of this finding? Let us look at some 
recent work with theoretical ties. 

3 All t tests are two-tailed. 


TABLE 3 

MEAN ATTITUDE CHANGE AS A FUNCTION OF DIFFERENTIAL 
PAYMENT FOR PERFORMING A 
CONSONANT ACT (CONTROL CONDITIONS) 

                        Payment 
                   $1        $5        Total 
Attitude change    -.025     -.145     -.08 
                   (20)      (17)      (37) 

Note. Ns in parentheses. 


Recently, Festinger and Carlsmith (1959), 
working within the framework of Festinger’s 
(1957) dissonance theory, performed an experiment 
in which subjects were induced, for differential 
payment, to perform an act discrepant or 
dissonant with their beliefs. They found that the 
less the payment, the greater the attitude change 
towards consistency with the act. One may note 
that the present model accounts for those data, 
but that dissonance theory does not account for 
the data in the present experiment. The present 
authors do not think that dissonance theory can 
be applied to this paradigm (with which Festinger 
agrees 4). One may note, however, that if one 
wanted to contort dissonance theory to apply to 
the present experiment, it would predict results 
opposite to those obtained. That is, one might 
theorize that the greater the payment for voicing 
your own beliefs, the greater the dissonance be- 
tween payment and acknowledged importance of 
the beliefs, that is, the subject does not think 
his beliefs important enough to require such pay- 
ment for their verbalization. One presumably 
would reduce this dissonance by psychologically 
increasing the importance of the beliefs. In- 
creased importance should produce greater re- 
sistance to countercommunications. Thus, the 
derivation leads to the prediction that increased 
payment would produce increased resistance— 
precisely the opposite of what was obtained here. 

Undoubtedly, this is not the only way to con- 
tort dissonance theory to fit the present paradigm, 
but this particular contortion does bring up an 
important point. That is, differential payment 
may indeed have affected the perceived im- 
portance of either the act or the belief. We have 
no evidence to present on this point, other than 
the lack of difference between control conditions. 
We may merely note that if perceived importance 
were affected by the manipulations, it would 
serve to attenuate differences between conditions, 
rather than strengthen them. Thus this argument 
leads to the conclusion that greater control of 
the importance variable would have produced 
stronger results than we obtained. 

One last point should be made clear. The 
authors are not attempting to say that the model 
proposed here accounts for all of the dissonance 
theory data; indeed, it clearly does not. It does, 
however, account for all of those data in which 
there is an inconsistency between act and attitude. 
At the same time, this narrowness of scope 
is precisely the limitation of the present model. 
The authors would argue, however, that the 
present model is an exploratory attempt to clarify 
a variable too long ignored by social psychology: 
commitment. 


4 L. Festinger, personal communication. 


EFFECTS OF UNCERTAINTY ABOUT THE NATURE AND 
ADVENT OF A NOXIOUS STIMULUS (SHOCK) UPON 
HEART RATE 


ROGERS ELLIOTT 1 
Dartmouth College 


Heart rate (HR) 


randomly selected male undergraduates while they awaited a shock. A group 


shock would be mild, that it would be strong, or knew nothing about it. The 


only significant effect 


ratings was associated with whether or not S knew anything about the nature 


of the shock to come. 


Nearly 40 years ago Skaggs (1926) investigated 
what he called variously "upsetness" and 
"excited expectancy" (terms he considered more 
neutral than "emotion") by measuring the 
cardiac and respiratory effects upon subjects of 
awaiting "one or more severe shocks" whose 
nature was unknown and whose time of occur- 
rence uncertain. This condition eventuated in 
a heart rate (HR) acceleration of 10 beats per 
minute (bpm) among his male subjects during 
the minute following the instruction, as compared 
with the base rate recorded prior to it. Subjects 
tended to describe their feelings during the wait- 
ing period as “tense.” Skaggs did not manipulate 
levels of his two independent variables (knowledge 
of the nature of the shock and of the time 
of its arrival), but Deane (1961) has recently 
done so. The HR acceleration relative to a basal 
rate was measured while his subjects spent 30 
seconds awaiting a mild shock. One group had 
had the experience of the shock, and the other 
had not; and half of the subjects in each group 
knew just when to expect shock, and the other 
half did not. There was a greater tonic accelerative 
HR change in the group which had not 
experienced the shock than in the group which 
had; but the knowledge-of-time variable had no 
effect. Deane suggested in his discussion that use 
of stronger shock might have produced changes 
different from the ones he found with mild shock. 


1 I wish to thank Gus Buchtel, Ray Peters, and 
Joseph Cardillo for getting the data, and the Faculty 
Research Committee of Dartmouth for assistance 
in preparing this report. 

The theoretical importance of such observations 
as Skaggs' and Deane's lies in the fact 
that they pertain to an emotional response (HR 
acceleration) whose antecedent is some degree 
of uncertainty about the nature and advent of 
an important event. Because of this theoretical 
importance, the present study was undertaken 
to extend Deane's design to cover the use of 
both strong and mild shock, and to replicate his 
results. It is possible that awaiting a shock which 
one has experienced as very strong will result 
in greater HR acceleration than will awaiting a 
shock of unknown value, a result contrary to 
what was found using mild shock. And it was 
surprising in Deane's results that knowledge of 
time of onset of shock had no effect upon HR 
acceleration, because knowing when a bad situa- 
tion is going to occur would seem to allow for 
less uncertainty and less emotional arousal than 
not knowing. 

Therefore, in the present factorial design, the 
independent variables are whether the subject 
has had experience or not with the shock he is 
to receive, and, if he has, whether it is a strong 
or mild shock; and whether he knows just when 
to expect the shock. The main dependent variable 
is change in HR from a rest period to the 
period awaiting shock; and a supplemental score 
is a verbal report of degree of tension, taken 
to assess the degree to which an overt verbal and 
covert autonomic response agree in indexing the 
arousal evoked by the various conditions. 
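
The main dependent variable is a simple difference score. As a minimal sketch, assuming HR is summarized as a mean rate in beats per minute within each period (the exact scoring is described later in the report), the computation might look like the following. The function name and the sample readings are hypothetical and are not data from the study.

```python
from statistics import mean


def hr_change(rest_bpm, waiting_bpm):
    """Mean HR while awaiting shock minus mean HR at rest; positive means acceleration."""
    return mean(waiting_bpm) - mean(rest_bpm)


# Hypothetical per-interval readings for one subject, in beats per minute.
rest = [72.0, 70.0, 71.0]
waiting = [80.0, 82.0, 81.0]

change = hr_change(rest, waiting)
print(change)
```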


METHOD 
Subjects 


The subjects were 60 undergraduate males at 
Dartmouth College who participated for extra course 
credit in introductory psychology. They did not 
know before they arrived that shock would be used, 
having been told that it was an “experiment in 
expectation.” No subject refused, and the data of 
all subjects were used, except for one case of record- 
ing failure, where an extra subject had to be re- 
cruited as a replacement. 


Apparatus 


The experimental space was a sound and electrically 
shielded room, 8 × 12 feet. The Grass EKG 
electrodes were attached to the subject across the 
heart from the upper left clavicle to the lower right 
rib, with a ground lead from the leg. All recordings 
were made on a Grass Model 7 polygraph. The 
shock apparatus consisted of a 6-volt dc battery, an 
inductorium which converted the current to 15 volts 
ac, a resistance box, and shock electrodes. One of the 
shock electrodes was a conical metal sleeve into 
which the index finger could be inserted, and the 
other was a metal plate onto which the heel of the 
hand could be strapped. When in use both were 
covered with electrode jelly, as were the EKG elec- 
trodes. Settings of the resistances in the circuit were 
such that a mild current of about 0.8 milliampere or a 
strong current of about 4.0 milliamperes was deliv- 
ered to the subject’s hand. A memory drum was 
used to present the numbers “1” through “12” at 
the rate of one every 3 seconds. 
Procedure 


Subjects were assigned randomly to conditions. On 
entering the experimental room, all subjects were 
told the nature of the EKG machine, and were 
assured that it was simply the same machine used 
in hospitals in taking careful physicals. After the 
EKG electrodes were placed, the subject was asked 
to relax as comfortably as possible for a few (ac- 
tually 3) minutes, so that the experimenter could 
determine a reference base HR. 

Following the rest period, the shock apparatus, 
which had been hidden behind a cardboard screen, 
was uncovered, and the subject was given the fol- 
lowing instruction: 


We are going to test HR reaction to pain using 
shock. The shock will be completely harmless. 
No damage to any tissue will occur, but a mildly 
painful shock will occur. We want a little pain 
because that is what we are studying. 


The subject’s hand was strapped to the metal 
plate, and his right index finger was daubed with 
jelly and inserted into the sleeve electrode. After this 
point the instructions varied depending on the con- 
dition to which the subject had been assigned, as 
follows: 


[Time known]: When the number in the memory 
drum reaches “12” you will be shocked. 

[Time unknown]: Sometime in the next 3 to 4 
minutes—maybe in a few seconds, maybe after 4 
minutes, maybe somewhere in between—you will 
receive a shock. Just when you get it depends 
upon a random, predetermined schedule. [In fact, 
shock was given on the number, “12.”] 

[Mild shock, known]: Before we begin to record 
again, I'll give you a shock so that you'll know 
just what strength of shock will be delivered to 
you later. [The experimenter then gave the subject 
a mild shock.] 

[Strong shock, known]: Same as above, but the 
experimenter gave the subject a strong shock. 

[Unknown shock]: The experimenter simply said 
“Ready.” No example of shock was given the 
subject. The actual shocks administered were 
mild for five subjects and strong for five subjects. 


Each of the two time conditions was paired with 
each of the three shock conditions. 

Finally, 30 seconds after the shock had been 
given, the subject was asked to report on his ten- 
sion, if any. 


Now tell me, as best you can, the degree of your 
uneasiness while waiting for the shock. Some 
people dislike shock more than others, by reason 
aces or of constitutional sensitivity or 
both. 


TABLE 1 

RESTING HEART RATE, HR ACCELERATION IN ANTICIPATION OF SHOCK, CORRELATION BETWEEN RESTING AND 
ACCELERATION SCORES, TENSION RATINGS, AND CORRELATIONS BETWEEN TENSION RATINGS 
AND ACCELERATION SCORES: BASIC DATA BY CONDITIONS 

                                                     Mild shock known    Strong shock known    Unknown shock 
Measure                                              Time      Time      Time      Time        Time      Time 
                                                     known     unknown   known     unknown     known     unknown 
1. Resting HR (bpm)                                  76.9      81.4      74.4      70.4        84.8      73.6 
2. HR acceleration (bpm)                              4.1       4.4       1.5       4.8        24.4      15.0 
3. Correlation, resting HR versus acceleration        .43      -.39      -.09       .16        -.40       .30 
4. Tension rating                                     4.7       4.1       4.8       4.9         6.2       4.9 
5. Correlation, tension rating versus acceleration   -.32      -.25      -.50       .31        -.50      -.07 

Note. N = 10 in each condition. 


A 9-point scale, labeled “Degree of Comfort,” was 
then held up in front of the subject. Beneath Number 1 
was the phrase “very relaxed,” beneath 5 was 
“moderately tense,” and beneath 9 was “very tense.” 
The very tense end was exemplified as being perhaps 
like the feeling one has when cramming for a very 
important exam for which one has not studied at all. 
The experimenter then said, 


Now, just before the shock actually came, where 
would you judge yourself to have been on the 
scale? Please point to the spot. 


The subject was thanked, his credit card was 
signed, and he was asked not to say anything about 
the experiment so that others would not know what 
to expect any more than he did. 


Handling of Data 

The HR was measured simply by counting all the 
R waves in the last minute of the initial rest, the 
30-second period just prior to shock, and the 30 
seconds just after shock. The tension score was the 
number on the scale nearest to the place at which the 
subject first pointed. 
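
The counting rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the example counts are invented rather than taken from the study, and acceleration is assumed to be the anticipation-period rate minus the resting rate, which the text implies but does not state as a formula.

```python
def beats_per_minute(r_wave_count, period_seconds):
    """Convert a count of R waves over a recording period to beats per minute."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return r_wave_count * 60.0 / period_seconds


def hr_acceleration(rest_count, anticipation_count,
                    rest_seconds=60, anticipation_seconds=30):
    """Assumed definition: anticipation-period HR minus resting HR, both in bpm.

    Defaults follow the periods named in the text: the last minute of rest
    and the 30 seconds just prior to shock.
    """
    resting = beats_per_minute(rest_count, rest_seconds)
    anticipating = beats_per_minute(anticipation_count, anticipation_seconds)
    return anticipating - resting


# Hypothetical counts, not data from the study:
# 76 R waves in the last rest minute, 40 in the 30 seconds before shock.
print(hr_acceleration(rest_count=76, anticipation_count=40))
```

Because the two periods differ in length, the counts have to be put on a common per-minute basis before they are subtracted; that is the only step the sketch adds to the description.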


RESULTS 


The primary data appear in Table 1, and the 
relevant analyses of variance in Table 2. The 
resting HRs varied from group to group in 
a distribution closely approximating Deane's 
(1961). The variation was not attributable to 
the experimental conditions (see Table 2). Row 
3 of Table 1 shows the within-group correlations 
between resting HR and HR change. The co- 
efficients average .01, there appears to be no 
consistent relation in their size or direction with 
any experimental variable, and no covariance 
analysis is indicated. Row 2 of Table 1 presents 
the HR acceleration scores, and their analysis 
is summarized in Table 2. It is very clear that 
the only variable significantly affecting these 
acceleration scores was whether or not the subject 
knew anything about the coming shock. The 
expectation of a strong shock caused no greater 
acceleration than the expectation of a mild one, 
and neither condition resulted in much acceleration; 
but not knowing what sort of shock to 
expect at all resulted in very considerable 
acceleration, averaging about 20 bpm. 

The tension scores are shown in Row 4 of 
Table 1 and their analysis is summarized in 
Table 2. Again, they were affected to a signifi- 
cant degree only as a function of whether the 
subject did or did not know the nature of the 
shock to come. Row 5 of Table 1 indicates the 
nature of the relation between the verbal and 
autonomic responses. None of the coefficients is 
significant, nor is their average (—.24). 
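
As a check on the analyses of variance summarized in Table 2, each F ratio can be recomputed as the mean square for an effect divided by the error mean square of the same analysis. The sketch below is illustrative only: it uses the mean squares that are legible in Table 2 for resting HR and HR acceleration, the variable and function names are hypothetical, and small departures from the printed F values are to be expected because the printed mean squares are rounded.

```python
# Mean squares as printed in Table 2 (effect MS and error MS, 54 error df).
ANALYSES = {
    "Resting HR": {
        "error_ms": 214.3,
        "effects": {"A (time)": 306.0, "B (shock)": 190.8, "A x B": 308.8},
    },
    "HR acceleration": {
        "error_ms": 125.2,
        "effects": {"A (time)": 42.9, "B (shock)": 1791.6, "A x B": 195.6},
    },
}


def f_ratio(ms_effect, ms_error):
    """F is the ratio of an effect mean square to the error mean square."""
    return ms_effect / ms_error


for measure, analysis in ANALYSES.items():
    for source, ms in analysis["effects"].items():
        f_value = f_ratio(ms, analysis["error_ms"])
        print(f"{measure:16s} {source:10s} F = {f_value:.2f}")
```

Only the knowledge-of-shock effect on HR acceleration yields a large ratio; the others are near or below 1.5, in line with the text.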

Though it did not make any difference in the 
overall amount of HR acceleration whether or 
not a subject knew when the shock was to come, 
the knowledge-of-time variable was effective 
in influencing the way in which HR changed 
within the 30-second anticipation period. The 
left-hand columns of Table 3 illustrate the 
change in HR in consecutive 10-second intervals, 
for each condition. Among those who knew when 
shock was to come, the increase in HR from the 
initial to the final 10 seconds of the waiting 
period was less than 1 bpm, and was not significant. 
Among those who did not know when 
shock was to come, there was a progressive and 
significant (p < .001) increase in HR totaling 
3.6 bpm. This differential effect of the 
knowledge-of-time variable appears in part attributable 
to the fact that among those who knew when 
the shock was to come, there was actually a 
deceleration in HR during the few cardiac cycles 
just prior to shock. The mean and median 
decelerations in each group for the last four cardiac 
cycles are presented in the right-hand columns 
of Table 3. Of those 30 subjects who knew 
when shock was to come, 20 showed deceleration; 
of those 30 subjects who did not know, only 13 
showed deceleration in these last few beats. 


TABLE 2 

SUMMARY OF ANALYSES OF VARIANCE OF RESTING HR, 
HR ACCELERATION, AND TENSION RATINGS 

Source                        df        MS        F 
Resting HR 
  Knowledge of time (A)        1     306.0     1.43 
  Knowledge of shock (B)       2     190.8    <1.00 
  A X B                        2     308.8     1.44 
  Error                       54     214.3 
HR acceleration 
  A                            1      42.9    <1.00 
  B                            2    1791.6    14.34** 
  A X B                        2     195.6     1.5 
  Error                       54     125.2 
Tension rating 
  A                            1       5.4     3.56 
  B                            2       ...     4.42* 
  A X B                        2       2.      1.61 
  Error                       54       1.5 

* p < .05. 
** p < .001. 


TABLE 3 

MEAN HR DURING THE THREE 10-SECOND PERIODS PRIOR TO SHOCK, 
AND MEAN AND MEDIAN OF CHANGE IN 
HR FROM FOURTH TO FIRST CARDIAC CYCLE PRECEDING SHOCK: DATA BY CONDITIONS 

                             HR (bpm) during waiting period         Change, fourth to first 
                                                                    cycle prior to shock 
Conditions                   First 10    Second 10    Third 10 
                             seconds     seconds      seconds       M          Median 
Time known 
  Mild shock known            80.5        81.1         81.3        -6.00      -2.50 
  Strong shock known          74.9        75.6         76.1        -2.02      -0.62 
  Unknown shock              109.1       107.9        109.9        -1.10      -0.25 
  Time known, combined        88.0        88.1         88.9        -3.04      -0.50 

Time unknown 
  Mild shock known            85.3        84.4         87.5        -3.15       1.50 
  Strong shock known          74.0        75.4         78.2         3.05       1.00 
  Unknown shock               87.4        89.8         91.7        -1.40       0.75 
  Time unknown, combined      82.2        83.2         85.8        -1.50       1.00 

Note. N = 10 in each condition. 


DISCUSSION 


The present results replicate those of Deane 
(1961) in every important respect: the tonic 
HR acceleration occurred in strength only among 
those who had not had any experience with the 
shock they were to encounter; and the knowl- 
edge of time variable was ineffective. In the 
present study the amount of acceleration in 
anticipation of any unknown shock was much 
greater than what Deane found; but the details 
of the present design and procedure were 
sufficiently different from his to account for such 
a difference. The results for strong shock were 
no different from those for mild shock, within 
the limits of the shock currents used. The finding, 
now repeated, that it makes no difference in the 
tonic, overall HR acceleration whether one knows 
when the shock is coming or not, is especially 
noteworthy because it denies the validity of the 
intuitively reasonable notion that uncertainty 
about the time of occurrence of a noxious event 
ought to lead to some measurable degree of 
emotional arousal. 

The verbal reports of tension were affected in 
much the same way as the covert HR responses 
by the experimental conditions. But in this study, 
as in a previous one in which HR and an overt 
instrumental response were compared (Elliott, 
1965), the two responses were not highly 
correlated within the individual. 


GROUP PERFORMANCE AS A FUNCTION OF TASK DIFFICULTY 
AND SIZE AND STRUCTURE OF GROUP: II¹ 


JULIAN O. MORRISSETTE 


Behavioral Sciences Laboratory, Wright-Patterson Air Force Base, Ohio 


This paper is a sequel to Morrissette, Switzer, & Crannell (1965). Performance 
data on 3-man groups were obtained in wheel (W) and circle (C) structures 
at 2 levels of task difficulty, H = 1.6 and H = 2.4. These data are compared 
with those obtained on 4- and 5-man groups under identical conditions. 
10 groups were run under each condition, with each group given 15 problems 
to solve. Problem solution-time and error data were collected. The problem 
solution-time data show the following: (a) in C structures, as group size 
increases performance deteriorates; (b) in W structures there is no relationship 
between group size and performance; (c) as group size decreases, the 
effect of structure on performance decreases. In the error data, only structure 
produced a significant effect, with W structures making fewer errors than C. 


Morrissette, Switzer, and Crannell (1965) reported 
data obtained from a study of four- and 
five-man groups operating in wheel (W) and 
circle (C) structures under two levels of task 
difficulty, H = 1.6 (“easy”) and H = 2.4 (“difficult”). 
Subsequently, data have been collected 
on three-man groups under conditions identical 
with those of the four- and five-man group 
study. These data will now be reported, along 
with the four-man group data of the previous 
study.²,³ 


1 This research was conducted in part at Miami 
University under Contract AF 33(657)-10456 with 
the Aerospace Medical Research Laboratories, 
Behavioral Sciences Laboratory, Wright-Patterson 
Air Force Base, Ohio. This paper is identified as 
AMRL Technical Report No. AMRL-TR-65-220. 
Further reproduction is authorized to satisfy needs 
of the United States Government. 

2 If the three-, four-, and five-man data were 
included in a single analysis of variance, the probability 
of obtaining significant differences would be 
about 1.00, since previous analyses of the four- 
and five-man data produced significant differences. 
Further analyses would have been required to show 
what we have shown by considering only the three- 
and four-man data. However, in the Discussion we 
shall coordinate the data obtained from all the 
groups. 

3 Since the three- and four-man groups were run 
a year apart, making it impossible to randomly 
assign subjects to all conditions, some question may 
be raised regarding the propriety of analyzing such 
data statistically. The procedure is proper if it can 
be assumed (or shown) that the subjects were 
drawn from the same population. In the present 
case, all subjects were Miami University men students, 
and no basis exists to believe that this population 
changes substantially from year to year. If 
one prefers not to accept this assumption, then he 
may attribute the results to differences in subject 
population rather than to the experimental conditions. 


METHOD 
Subjects 


Subjects were Miami University men students who 
volunteered and who were paid for their participation. 
Subjects were randomly assigned to experimental 
conditions. Ten groups were run in each 
condition. 


Apparatus and Task 


The apparatus used was essentially identical to 
that described by Leavitt (1951). The task is described 
in detail by Morrissette, Pearson, and 
Switzer (1965) and summarized in Morrissette, 
Switzer, and Crannell (1965). 


Procedure 


The procedure used is given in Morrissette, Switzer, 
and Crannell (1965). We note, however, that 15 
problems were given to each group. As in the 
earlier study, the effects of learning were minimized 
by considering only the data from Problems 6-15. 
The group's solution-time score on each problem is 
the average of the time required by subjects in the 
group to obtain and transmit the correct answer to 
the experimenter. The maximum (poorest) score on 
each problem was 10 minutes, the time limit set on 
each problem. A group's error score is the total 
number of erroneous answers transmitted to the 
experimenter. 


RESULTS 
Solution-Time Data 


The mean solution times per problem are given 
in Table 1, and the analysis of variance based 
on these data is shown in Table 2. All of the 
main effects are significant. Examining the means 
of the main effects, we find that: three-man 
groups (1.94 minutes) are faster than four-man 
(2.46 minutes); performance on the H = 1.6 
task (1.80 minutes) is faster than on the H = 2.4 
(2.60 minutes); and the W (2.07 minutes) is 
faster than the C (2.33 minutes). 


TABLE 1 

MEAN SOLUTION TIMES (IN MINUTES) PER PROBLEM 
FOR PROBLEMS 6-15 

                  Task                Structure 
Group size        difficulty (H)      W         C 
Three             1.6                 1.61      1.55 
                  2.4                 2.21      2.38 
Four              1.6                 1.91      2.15 
                  2.4                 2.56      3.25 

None of the interactions attains an acceptable 
level of significance (p < .05). However, the 
A X C interaction approaches significance 
(p < .10). These data show that for the three-man 
groups the W and C means are 1.91 minutes and 
1.97 minutes, respectively, while for the four-man 
groups the W and C means are 2.24 minutes 
and 2.70 minutes, respectively. The difference 
between the W and C is greater in the four-man 
than in the three-man groups, and the difference 
between the three- and four-man groups is 
greater in the C than in the W. 


TABLE 2 
ANALYSIS OF VARIANCE: PROBLEM SOLUTION TIMES 

Source              df        MS         F 
Group size (A)       1      5.53     18.43*** 
Task (B)             1     12.83     42.77*** 
Structure (C)        1      1.31      4.37** 
A X B                1       .45      - 
A X C                1       .80      2.67* 
B X C                1       .58      1.93 
A X B X C            1       .06      - 
Error               72       .30 

* p < .10. 
** p < .05. 
*** p < .001. 


Error Data 


The mean errors per problem are given in 
Table 3, and the analysis of variance based on 
these data in Table 4. Only structure produced 
a significant effect (p < .05) while group size 
approached significance (p < .10). The main effect 
means show that the W made .50 error per 
problem and the C, .78, and three-man groups 
made .51 error and four-man, .76. Though the 
differences on task did not approach significance, 
we note their error rates: H = 1.6, .56; 
H = 2.4, .71. 


TABLE 3 
MEAN ERRORS PER PROBLEM FOR PROBLEMS 6-15 

[Cell entries, by group size and task difficulty for the W and C structures, are not legible in the source.] 


DISCUSSION 


We shall first compare the present results 
with those reported for the four- and five-man 
group study, and then these results will be 
examined in the light of the Thomas and Fink 
(1963) literature review. 


Solution-Time Data 


In the four- and five-man study, group size 
did not produce a significant effect, as it did 
in the present analysis. However, the Group 
Size X Structure interaction was significant in 
the original study and it approached significance 
here. The general consistency of this latter result 
suggests the examination of this interaction for 
the three group sizes. These data are shown in 
Table 5, and they suggest the following general 
propositions: in C structures, as group size in- 
creases performance deteriorates; in W struc- 
tures there is no relationship between group size 
and performance; as size of group decreases, the 
effect of structure on performance decreases (to 
zero?). 

Structure produced a significant main effect in 
both studies, but the data in Table 5 indicate 
that this result was obtained from the differences 
in W and C in the four- and five-man groups 
only. 
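
The pattern described here can be made concrete by computing the C minus W difference in mean solution time at each group size, which is what the difference column of Table 5 reports. The sketch below is illustrative: the means are those given in Table 5, while the dictionary layout and the function name are hypothetical.

```python
# Mean solution times (minutes per problem) from Table 5,
# keyed by group size and then by structure.
MEAN_TIMES = {
    3: {"W": 1.91, "C": 1.97},
    4: {"W": 2.24, "C": 2.70},
    5: {"W": 2.19, "C": 3.10},
}


def structure_effect(mean_times):
    """Return the C minus W difference in mean solution time for each group size."""
    return {
        size: round(cell["C"] - cell["W"], 2)
        for size, cell in sorted(mean_times.items())
    }


for size, diff in structure_effect(MEAN_TIMES).items():
    print(f"{size}-man groups: C - W = {diff:.2f} minutes")
```

The difference grows from .06 to .46 to .91 minutes as group size goes from three to five, which is the sense in which the structure main effect rests on the four- and five-man groups.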

Task difficulty produced a significant main 
effect in both analyses, while in neither analysis 
was the Task X Group Size interaction significant, 
indicating that performance on the H = 1.6 
task was faster than on the H = 2.4 task for all 
levels of group size. Similar results were obtained 
on the Task X Structure interaction: in neither 
analysis was the interaction significant, indicating 
that performance was faster on the H = 1.6 task 
than on the H = 2.4 task in both W and C 
structures. 


TABLE 4 
ANALYSIS OF VARIANCE: ERROR DATA 

Source              df        MS         F 
Group size (A)       1    120.05      3.45* 
Task (B)             1     39.20      1.13 
Structure (C)        1    156.80      4.51** 
A X B                1     18.05      - 
A X C                1       .45      - 
B X C                1       .00      - 
A X B X C            1     54.45      1.57 
Error               72     34.77 

* p < .10. 
** p < .05. 


TABLE 5 

MEAN SOLUTION TIMES (IN MINUTES) PER PROBLEM 
FOR THREE-, FOUR-, AND FIVE-MAN GROUPS 

                    Structure 
Group size          W         C         Difference 
Three               1.91      1.97       .06 
Four                2.24      2.70       .46 
Five                2.19      3.10       .91 


Error Data 


Only structure produced a significant effect in 
both analyses, with W making significantly fewer 
errors per problem than the C. It should be 
noted, however, that group size approached sig- 
nificance in the present analysis, with three-man 
groups making fewer errors than four-man, while 
in the original study, no indication of a group 
size effect was found. 

Examining these results in the light of the 
Thomas and Fink (1963) paper, we find them 
reporting that "measures of speed showed no 
difference or else favored the smaller groups." 
Our data are consistent with this finding. How- 
ever, from the data in Table 5, we may say 
that the smaller the group, the faster the per- 
formance in C structures, while in W structures 
there is no relationship between size of group 
and speed of performance. 

Considering quality of performance, Thomas 
and Fink report a positive relation "with group 
size under some conditions and under no condi- 
tions were smaller groups superior." If we take 
the error data as an indication of "quality of 
performance," then the present data are con- 
sistent with the Thomas and Fink conclusions, 
since we may say that “under no conditions were 
smaller groups [significantly] superior." 


INDIVIDUAL PERCEPTUAL STYLES FOLLOWING INDUCED FAILURE¹ 


DONALD R. BROWN 
University of Michigan 

AND 

R. JAMES YANDELL 
University of California, Berkeley 


3 groups of Ss were exposed to either a level of aspiration failure, success, 
or neutral perceptual task followed by a threshold determination using goal, 
failure, and instrumental stimuli. Each S served as his own control. The 
induced failure group results were analyzed to determine the individual 
differences in threshold following failure in accordance with Eriksen's 2nd 
criterion for perceptual defense. The results support the relevance of 
Eriksen's criterion. 


Eriksen (1963) in a recent review of the 
personality-perception literature concludes that 


the evidence surveyed . . . comprises a convincing 
testimonial as to the genuineness of the perceptual 
defense phenomena. It demonstrates that when 
experiments employ adequate precautions to insure 
that perceptual stimuli are indeed anxiety-arousing 
for the individual subjects and take into account 
the individual differences in defenses, defensive 
mechanisms, clinically conceived, do reveal themselves in 
the perceptual recognition of stimuli [p. 50]. 


In essence, Eriksen is setting up two criteria 
for the demonstration and exploration of motiva- 
tional effects on perceptual thresholds. First, that 
the relevance of the stimuli to the individual 
subject be independently demonstrated in the 
experimental context; and second, that threshold 
changes be consistent within individual subjects 
in accordance with independent knowledge of 
each subject's characteristic response to self-rele- 
vant stimuli. 

In view of Eriksen’s reopening of the “per- 
ceptual-defense” controversy it seems appropriate 
to report a further analysis of an experiment by 
Postman and Brown (1952) from the point of 
view of individual differences which satisfies both 
of these criteria. 


METHOD 
Experiment 


The experiment consisted of two procedures: es- 
tablishment and independent demonstration of a 
situational context of success or failure and meas- 
urement of perceptual sensitivity to symbols of suc- 
cess and failure. To establish a context of success or 
failure, the level of aspiration situation was used. 
The procedure was so arranged that some subjects 
attained or exceeded most of their stated goals in an 
experimental task while other subjects did so only 
rarely. Thus, contexts of success or failure could be 
created under standard conditions. The task was a 
test of span of apprehension. It consisted 
of a series of 15 slides on each of which a set of 12 
symbols (capital letters, small letters, and numbers) 
appeared. The spatial arrangement of the symbols 
varied considerably from slide to slide. Each slide 
was exposed tachistoscopically for 1 second, and 
the task was to report as many of the symbols as 
possible. The two experimental groups of subjects 
were instructed to state their expected achievement 
before each exposure in terms of percentile ranks 
supposedly computed from the performance of a 
sample of the subject’s peers. A control group went 
through the procedure as a straight perceptual task 
without stating their goals beforehand and without 
being given scores. Thus, members of the two ex- 
perimental groups announced their estimated scores 
prior to the exposure of each slide and were given 
their score after each exposure in percentile ranks. 
Before announcing the score, the experimenter went 
through the motions of looking up the subject’s 
standing in a table of simulated norms. 

The ranks announced to the members of the suc- 
cess group were such that the subjects exceeded their 
estimates on 11 of the 15 trials. The ranks announced 
to members of the failure group fell short of their 
estimates on 11 of the 15 trials. 

Immediately following the span of apprehension 
procedure, a series of 24 words was presented to all 
subjects for tachistoscopic recognition. There were 
12 critical, motivationally relevant words and 12 
control words. Of the 12 critical words, 4 were 
“goal words” connoting success, 4 were “deprivation 
words” connoting failure, and 4 were “instrumental 
words” connoting instrumental striving. Each 
critical word was matched for frequency by a neutral 
control word. The words were presented in 
random order typed in capital letters on 2 X 2 
lantern slides. The slides were flashed on a ground-glass 
screen from a projector equipped with a photographic 
shutter. Two parallel lines, exposed by means 
of a second projector, provided the fixation area 
within which the stimuli appeared. Speed of ex- 
posure was kept constant at 1/100 second. The 
threshold of recognition was determined by increas- 
ing the brightness of the exposure. At the beginning 
of the procedure the approximate threshold of the 
subject was determined by means of a practice 
slide. Exposures of all subsequent slides were then 
begun about 15 steps below the estimated threshold 
and continued until correct recognition of the words. 


Subjects 


The subjects were male undergraduates and 
graduate students at the University of California 
who had volunteered to take part in the experiment. 
The members of the failure group, in whom we are 
particularly interested in this report, were tested 
under rather special conditions. These subjects were 
all advanced graduate students in the physical or 
social sciences who were undergoing a series of 
intensive tests and interviews during a 3-day assessment 
at the Institute of Personality Assessment and 
Research at the University of California. Our 
experiment was one of the procedures which the subjects 
underwent during the assessment, and it is with 
the personality data collected on them in other parts 
of the program that this report deals. The 
procedure took 1 hour. There were 39 in the failure 
group and 14 each in the success and control groups. 


RESULTS 
Group 


Briefly, the group results were as hypothesized. 
The level of aspiration of the failure group 
steadily declined, whereas that of the success 
group showed a progressive increase. In addi- 
tion, the failure group attempted more responses 
and made a larger percentage of errors in the 
experimental task than did the success group or 
the control group. Thus, both level of aspiration 
behavior and performance in the experimental 
task indicate that contexts of success and failure 
were created for the experimental groups. The 
failure group was significantly more sensitive to 
deprivation words than were the success group 
and the control group. Similarly, the success 
group was significantly sensitized to goal words. 
The pattern of thresholds for instrumental words 
was similar to that for goal words but the differences 
among groups fell short of statistical significance. 
The detailed results are to be found in 
Postman and Brown (1952). It would appear 
from these data that the area of selectivity of 
perception is, at least partly, a function of the 
immediate context of the stimuli. 


1 The authors express their gratitude to Donald 
W. MacKinnon, Director of the Institute of Personality 
Assessment and Research, for his cooperation 
in gathering these data. 

The group analysis, then, supports Eriksen’s 
first methodological criterion. Now, in view of 
Eriksen’s second criterion, the question arises as 
to whether the degree of sensitivity, as measured 
in this instance by lowered thresholds, is in any 
way related to the personality structure of the 
individual perceiver. In order to explore this 
question, the individual differences obtained in the 
failure group were studied in relation to the 
ratings given by the assessment staff at the In- 
stitute of Personality Assessment and Research 
after having lived with and studied the subjects 
in groups of 10 for 3 days and nights. These 
ratings were made independently by at least six 
judges without access to any of the data obtained 
by experimenters other than themselves. Also, an 
item analysis was made of the adjectives checked 
by the subjects as descriptive of themselves as 
well as the adjectives checked by the staff as 
descriptive of each subject on the Gough Adjec- 
tive Check List. 


Individual 


We find that a high threshold to goal 
words following failure correlates +.37 with staff 
ratings of “potential success in the professional 
field of the subject” and +.54 with ratings of 
“soundness as a person.” The former is significant 
at the .05 level while the latter is significant at 
the .01 level. Low thresholds to the deprivation 
words correlate +.42 with staff ratings of “originality 
and likelihood of original and creative 
work in the chosen field.” Again this is significant 
at the .01 level. When we combine the degree 
of sensitivity to goal words and to deprivation 
words in a multiple correlation, we find all the 
multiple r’s obtained to be significant at the .01 
level. They are +.43 with “potential success,” 
+.45 with “originality,” and +.54 with “soundness 
as a person.” Furthermore, high thresholds 
to goal words following failure were found to 
correlate at the .01 level with the MMPI K 
scale, staff ratings of positive character integra- 
tion, and staff ratings of adjustment. At the .05 
level this variable was found to correlate with 
the CPI Dominance scale, the CPI Responsibil- 
ity scale, and staff ratings of social relations and 
leadership. 
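
The significance levels quoted for the simple correlations can be checked with the usual t test for a correlation coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below is illustrative only and rests on assumptions not stated in the text: that all 39 failure-group subjects entered each correlation, that the tests were two-tailed, and that the tabled critical values of t for 37 degrees of freedom are about 2.026 (.05) and 2.715 (.01). The function and constant names are hypothetical.

```python
import math

N_FAILURE_GROUP = 39  # assumed n for every correlation (size of the failure group)

# Approximate two-tailed critical values of t for df = 37 (assumed, from standard tables).
CRITICAL_T = {0.05: 2.026, 0.01: 2.715}


def t_for_correlation(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def strictest_level(r, n=N_FAILURE_GROUP):
    """Return the strictest tabled level (.01, then .05) that |t| exceeds, or None."""
    t_value = abs(t_for_correlation(r, n))
    for level in sorted(CRITICAL_T):  # .01 is tried before .05
        if t_value > CRITICAL_T[level]:
            return level
    return None


for label, r in [("potential success", 0.37),
                 ("soundness as a person", 0.54),
                 ("originality", 0.42)]:
    print(f"{label:22s} r = {r:.2f}  t = {t_for_correlation(r, 39):.2f}  "
          f"level = {strictest_level(r)}")
```

Under these assumptions the three coefficients reach the .05, .01, and .01 levels respectively, which agrees with the levels reported above.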

From the above it appears that the lower the 
threshold to failure stimuli following failure 
experiences, the more likely a subject is to be seen 
as well adjusted and successful by a staff of 
psychologists and on various scored measures. An 
even more striking characterization of the dif- 
ferences between the high- and low-threshold 
subjects in the failure group is found in the 
Adjective Check List analysis. The subjects who 
showed high thresholds following failure to the 
goal words described themselves at the .07 level 
or better more often as confident, determined, 
good-natured, praising, original, unselfish, 
responsible, and enthusiastic. On the other hand, 
the subjects with low thresholds to the goal words 
described themselves at the .05 level more often 
as pessimistic, aloof, bitter, complaining, and 
planful. 

The composite staff checks on the same variable 
describe the highs at the .05 level as pleasant, 
reliable, capable, cooperative, intelligent, sociable, 
civilized, tactful, dependable, helpful, fair-minded, 
reasonable, adaptable, easygoing, poised, self- 
controlled, and sentimental. The lows are de- 
scribed at the .05 level as self-centered, impa- 
tient, tense, bossy, changeable, preoccupied, 
wary, awkward, confused, emotional, high-strung, 
and nervous. 

CONCLUSION 


It seems appropriate to point out again the 
obvious value of perceptual cognitive procedures 
in investigating personality. Not only do the 
above results have important implications for a 
general theory of perception,² but they might also 
be interpreted to mean that, contrary to widely 
held belief, perception is not usually defensive 
in the narrow repressive sense but rather in a 
unitary way contributes to the general effective- 
ness of the organism under conditions of stress. 
Furthermore, the degree to which this increased 
effectiveness is reflected in perception is signifi- 
cantly related to the general adjustment mode of 
the subject. 

The results when analyzed at the group and 
individual levels satisfy both of Eriksen's cri- 
teria and clarify the dependence of selective 
sensitivity on the association between situational 
cues and perceptual personality predispositions. 
As Postman and Brown (1952) put it,

the selectivity of perception is not necessarily in-
strumental to wish-fulfillment or tension reduction
[in the usual Freudian sense]. In connection with
the investigation of motivational and personality
variables there is a temptation to think of perception
as “autistic” ... serving the needs and wishes of
the organism within the limits allowed by the
stimulus conditions. Such may indeed be the case
in some situations, but the assertion that perceptual
selectivity reflects the motives and past experiences
of the individual has higher generality . . . [p. 213].

The possibility must be left open that perceptual
selectivity may favor negatively valued objects,
and indeed this turns out to be the case follow-
ing failure; furthermore, the degree of such
selectivity is significantly related to overall per-
sonality adjustment in our sample.

2 The use of the term perception may be ob-
jected to since it is impossible to independently
characterize the results as a perceptual change and
not merely as an altered response probability. The
emphasis in these results is on the consistent relation
between the intraindividual personality data and the
perceptual task data. The reader may prefer “re-
sponse change” if central constructs such as “per-
ception” are offensive.


REFERENCES


Eriksen, C. W. Perception and personality. In J.
M. Wepman & R. W. Heine (Eds.), Concepts of
personality. Chicago, Ill.: Aldine, 1963. Ch. 2.

Postman, L., & Brown, D. R. The perceptual conse-
quences of success and failure. Journal of Ab-
normal and Social Psychology, 1952, 47, 213-221.

(Received September 24, 1964)

3 problems have precluded clear findings in the authoritarianism-conformity
area: the existence of an agreement response set in the commonly used F Scale,
the failure to demonstrate consistency in conformity behavior, and an excessive
reliance on linear correlation analysis. The present study attempts to overcome
these problems. The principal findings show that consistently high conformers
(N = 20) score higher (p < .01) on a forced-choice version of the F Scale than
do consistently low conformers (N = 25). No sex difference was observed. It
seems likely that connotations of rigidity and narrow-mindedness go with the
authoritarian syndrome, and that these relate also to the personality of the
high conformer. It is pointed out that other personality correlates of con-
formity, not tapped by the F Scale, have been reported.


There has been a remarkable lack of agreement 
in psychological literature concerning the nature 
of the relationship between authoritarianism and 
conformity. Some writers (Beloff, 1958; Crutch- 
field, 1955; Nadler, 1959; Wells, Weinert, & 
Rubel, 1956) have reported a positive association 
between conformity and scores on the California 
F Scale. In other instances, this significant link 
was observed only when an interaction between 
authoritarianism and degree of interpersonal con- 
fidence was taken into account (Berkowitz & 
Lundy, 1956), when confederate subjects ex- 
pressed complete unanimity (Steiner & Johnson, 
1963), or when prestige rather than nonprestige 
suggestibility was involved (Millon & Simkins, 
1957). Other writers (Gorfein, 1961; Hardy, 
1957; Weiner & McGinnies, 1961) have failed 
to find any relationship between F scores and 
conformity. 

Three problems prevent an adequate assess- 
ment of these findings. The first is one that has 
gained general recognition: the F Scale is a de- 
vice which allows an agreement response set bias 
to operate. It has been estimated (Chapman & 
Bock, 1958; Quinn & Lichtenstein, 1962) that 
up to 40% of the variance of the F Scale is at- 
tributable to a factor of “acquiescence.” The 
implication, which has already been noted by 
Beloff (1958) and by Small and Campbell (1960), 
is that even if F scores are found to correlate 
positively with a selected conformity measure,
no further light has been cast on the personality
of the conformer. Rather, “acquiescence-inducing
situations and the orthodox F scale just measure
the same thing [Beloff, 1958, p. 103].” 

A second problem is that researchers in this 
area have usually employed a solitary measure of 
conformity, making the unjustifiable assumption
that they have thereby tapped an underlying trait
of behavior. In fact, consistency which would
warrant the use of a term such as “trait” has 
been found in only about 20% of a population of 
female students (Vaughan, 1964). 

A third problem has been an excessive reliance 
upon linear correlation analysis of the scores of 
all subjects in a given population, when the ex- 
perimental effect may exist only in restricted 
portions of that population. This problem, and the 
previous one, are discussed elsewhere (Vaughan, 
1964). Neither is restricted in relevance to
studies of authoritarianism and/or conformity. 
Generally, it could be said that these points are 
of critical importance in evaluating experimental 
studies dealing with personality correlates, par- 
ticularly where disagreements between researchers 
flourish. 

The present study represents a further attempt 
to test for the existence of a relationship between 
F scores and conformity. It differs from related 
investigations in that conformity in the trans- 
situational sense is emphasized, and a form of the 
F Scale is used from which the confounding 
variable of acquiescence has been eliminated. 


METHOD 
Subjects 


From an original pool of 231 male and 145 female 
first-year psychology students, subjects were elimi- 
nated who failed to complete all experimental treat- 
ments, who guessed the experimental implications 
(assessed in a postexperimental interview), or who 
were of non-European descent. The Ns were ac- 
cordingly reduced to 194 and 118 for males and 
females, respectively. From these, con-
sistently high and consistently low conforming
groups were selected in terms of criteria referred to
below. The high conformity (HC) group consisted 
of 10 males and 10 females, and the low conformity 
(LC) group of 16 males and 9 females. 

TABLE 1
MEAN FCF SCORES OF HC AND LC GROUPS

Group    Male    Female
HC       3.24    3.20
LC       2.57    2.69


Conformity Measures 


Three measures of conformity used previously 
(Vaughan, 1964) were employed in the present in- 
vestigation. Chosen for their relative situational dis- 
creteness, they included group pressure (GP), nor- 
mative pressure (NP), and social acquiescence (SA). 
GP refers to a situation involving an artificially in- 
teractive group, in which subjects make a perceptual 
judgment communicable by means of visual signaling 
apparatus (cf. Crutchfield, 1953). NP comprises the
manipulation of artificial norms to induce opinion 
shift in individual subjects (cf. Wiener, Carpenter, & 
Carpenter, 1956, 1957). SA denotes the Bass (1956) 
Social Acquiescence Scale, a questionnaire which has 
been found to discriminate between high and low 
scores on an acquiescence dimension. Further detail 
concerning these measures is available (Vaughan, 
1964). 

The criteria used for selecting HC and LC groups 
were upper and lower quartiles for both GP and NP, 
and a median cut for SA. The issue at stake here is 
that transsituational consistency in conformity be- 
havior must be demonstrated where possible subtle 
correlates of this trait are being investigated. 
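
The selection rule described above can be sketched in code. This is a minimal illustration rather than the procedure actually used in the study: the record layout, the function name, the handling of scores that fall exactly on a cut point, and the sample scores are all assumptions made for the example.

```python
from statistics import median, quantiles


def select_criterion_groups(subjects):
    """Return (high_ids, low_ids) for transsituationally consistent subjects.

    `subjects` is a list of dicts with keys 'id', 'GP', 'NP', 'SA'
    (a hypothetical layout). A subject is a high conformer if in the
    upper quartile on both GP and NP and above the SA median, and a
    low conformer if in the lower quartile on both and below the SA median.
    """
    cuts = {}
    for measure in ("GP", "NP"):
        # quantiles(..., n=4) returns the three quartile cut points.
        q1, _, q3 = quantiles([s[measure] for s in subjects], n=4)
        cuts[measure] = (q1, q3)
    sa_median = median(s["SA"] for s in subjects)

    high, low = [], []
    for s in subjects:
        in_upper = all(s[m] >= cuts[m][1] for m in ("GP", "NP"))
        in_lower = all(s[m] <= cuts[m][0] for m in ("GP", "NP"))
        if in_upper and s["SA"] > sa_median:
            high.append(s["id"])
        elif in_lower and s["SA"] < sa_median:
            low.append(s["id"])
    return high, low


# Hypothetical scores for eight subjects (not data from the study).
gp = [1, 2, 3, 4, 5, 6, 7, 8]
np_scores = [2, 1, 3, 4, 5, 8, 6, 7]
sa = [1, 2, 3, 4, 5, 6, 7, 8]
demo = [
    {"id": i + 1, "GP": gp[i], "NP": np_scores[i], "SA": sa[i]}
    for i in range(8)
]
high_ids, low_ids = select_criterion_groups(demo)
print("HC:", high_ids, "LC:", low_ids)
```

Requiring agreement across all three measures leaves most subjects unclassified, which is consistent with the small HC and LC groups reported above.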


Authoritarian Measure 


The instrument used was Berkowitz and Wolkon's 
(1964) forced-choice version of the F Scale (FCF). 
In this form, each positively phrased F statement is 
paired with a negative counterpart. Half of the F+ 
items appear on the left of their respective pairings 
under the label “A,” and half on the right under 
“B”; and vice versa for the F— items. The respondent 
selects either A or B from each pair and also indi- 
cates the extent of his agreement with his choice. 
The authors point out that, as all answer categories
involve agreement, the design forces agreement with 
either F+ or F— items, and acquiescent response set 
is therefore eliminated. Berkowitz and Wolkon feel 
that a positional response set may still intrude, but 
the important point for the present research is that 


TABLE 2
ANALYSIS OF VARIANCE OF DATA IN TABLE 1

Source            df     MS      F
Conformity (A)     1    3.75    9.62*
Sex (B)            1    0.00
A × B              1    0.01
Error             41    0.39

*p < .01.


.1963; Crutchfield, 1955; Nakamura, 1958; 
TABLE 3
CORRELATION AND REGRESSION COEFFICIENTS AMONG
CONFORMITY AND AUTHORITARIAN MEASURES

        GP      SA      FCF     β      β²
NP     .17*    .19*    .21*    .14    .02
GP             .02     .15*    .12    .01
SA                     .28*    .25    .06

*p < .01.


their study does seem to indicate that acquiescence 
as such no longer plays a part in the resultant scores. 

Berkowitz and Wolkon chose one set of F— items 
from Bass (1955) and another from Christie, Havel, 
and Seidenberg (1958). The form adopted in the 
present investigation used the F— set from the second 
of these sources. 


RESULTS AND DISCUSSION 


The mean FCF scores of the HC and LC 
groups, male and female, are given in Table 1; 
and an analysis of variance of these data, using 
an unweighted-means solution, in Table 2. 

It can be seen from these results that HC sub-
jects score significantly higher on the FCF scale
than do LC subjects. No effects are observed re- 
lating to sex of subject, or to an interaction be- 
tween sex and conformity. 

To test for the effectiveness of using criterion 
groups, as above, product-moment correlations 
between all measures were calculated, based on 
the original N of 312 (male and female subjects 
combined). These results, together with a regres- 
sion analysis referred to below, are shown in 
Table 3. 

It can be seen from these results that five out 
of six intercorrelations for the conformity and 
authoritarian measures are significant. In all 
cases, however, the r’s are small, the largest being 
.28 observed with respect to the SA and FCF
scales. The correlations between the three con- 
formity measures are very small, and indicate 
that any assumption of linear relationships be- 
tween them would be a tenuous one. It has been 
shown (Vaughan, 1964), however, that low cor- 
relations do not necessarily prevent the isolation 

1 There is conclusive evidence (Allen & Crutchfield,
1963; Crutchfield, 1955; Nakamura, 1958; Tudden-
ham, 1958, 1961) that males conform less than fe-
males in a group pressure situation. To some extent,
the present results do suggest that distinct population
strata are involved, in that the N of 16 for the male
LC group is disproportionately high in relation to the
three remaining groups. This would make a least-
squares solution the appropriate analysis. The latter
was carried out and yielded essentially the same re-
sults, except that an even larger F of 17.64 was ob-
served in the case of the main effect of conformity.

of a restricted number of consistent conformers
and nonconformers. The results in Table 3 also 
show that each of the conformity measures cor- 
relates to a limited extent with the FCF scale. 
From the matrix of six coefficients given in the 
left half of the table, a multiple correlation 
analysis was carried out. From this, regression 
coefficients were calculated which take into ac- 
count the standard score form of the regression 
equation (necessitated by scale unit differences 
between the conformity measures). These coef-
ficients are shown in the β column of Table 3,
and their squares, which provide an estimate of
the contributions of the conformity measures to
the variance of the FCF scores, in the column
β². It can be seen from these data that SA, NP,
and GP contribute 6, 2, and 1%, respectively, to 
the variance of FCF scores. While SA shows up 
as being relatively more important than the other 
measures in this analysis, the absolute contribu- 
tions of all three towards the authoritarian meas- 
ure would seem to be slight when the whole 
range of the present population is used. 
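
The standard-score regression weights discussed above can be recovered from the correlation matrix alone, by solving the normal equations (predictor intercorrelations times weights equals predictor-criterion correlations). The sketch below is an illustration under stated assumptions, not the author's computation: the NP-GP and GP-FCF correlations are taken as .17 and .15, which is an assumed reading of the corresponding Table 3 entries, and the helper name is hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Predictor order: NP, GP, SA. Values as read from Table 3
# (.17 and .15 are assumed readings of unclear entries).
predictor_r = [
    [1.00, 0.17, 0.19],
    [0.17, 1.00, 0.02],
    [0.19, 0.02, 1.00],
]
criterion_r = [0.21, 0.15, 0.28]  # correlations of NP, GP, SA with FCF

betas = solve_linear_system(predictor_r, criterion_r)
for name, beta in zip(("NP", "GP", "SA"), betas):
    print(f"{name}: beta = {beta:.2f}, beta squared = {beta * beta:.2f}")
```

With these inputs the weights come out at about .14, .12, and .25 for NP, GP, and SA, and their squares at about .02, .01, and .06, which agrees with the 2, 1, and 6% contributions stated in the text.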

A final assessment of the predictability of the 
individual conformity measures was effected as 
follows: First, subjects were divided on each 
measure into two groups using a median cut, and 
compared in terms of FCF scores likewise divided 
at the median. The percentage correctly “pre- 
dicted” on the authoritarian measure by SA, NP, 
and GP in turn was 61.38, 59.44, and 56.39. 
Second, upper and lower quartile groups on each 
conformity measure were compared in terms of 
FCF scores, again divided at the median. The 
percentages correct were correspondingly in- 
creased to 70.00, 61.03, and 60.71. Third, the 
transsituational criterion groups were treated in 
the same manner, and the percentage correctly 
predicted was found to be 88.89. Clearly, the 
trend in these results is an increase in the ef- 
ficiency of prediction on the FCF scale as one 
moves away from the middle range of scores on 
the three conformity measures; and for marked 
predictive efficiency when only consistent con- 
formers and nonconformers are considered. 
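
The "percentage correctly predicted" index used above can be illustrated as follows. This is a sketch under assumptions: a subject counts as correctly predicted when the conformity score and the FCF score fall on the same side of their respective medians, scores exactly at a median are treated as low, and the sample scores are invented for the example.

```python
from statistics import median


def percent_correctly_predicted(conformity, fcf):
    """Percentage of subjects whose conformity and FCF scores fall on the
    same side of their respective medians (high-high or low-low)."""
    conformity_cut = median(conformity)
    fcf_cut = median(fcf)
    hits = sum(
        (c > conformity_cut) == (f > fcf_cut)
        for c, f in zip(conformity, fcf)
    )
    return 100.0 * hits / len(conformity)


# Hypothetical scores for six subjects (not data from the study).
conformity_scores = [1, 2, 3, 4, 5, 6]
fcf_scores = [10, 30, 15, 40, 50, 20]
print(round(percent_correctly_predicted(conformity_scores, fcf_scores), 2))
```

Applying the same function only to subjects in the extreme quartiles, or only to the criterion groups, corresponds to the second and third steps described in the text.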

In returning to the main results in Table 1, we 
can conclude that the FCF scale is tapping per- 
sonal characteristics, in criterion groups of con- 
formers and nonconformers, which are distinct 
from a tendency to merely acquiesce to scale 
style. It is worth noting that Crutchfield (1955) 
found that staff ratings on amount of authoritarian 
behavior, manifested in a standard psychodrama 
situation, correlated .35 with conformity. Again, 
Canning and Baker (1959), using high and low 
scores on a Religious Authoritarianism scale, re-
ported that authoritarians were more susceptible
to group influence in the autokinetic situation.
Vaughan (1964), furthermore, found a positive
association between conformity and a variable 
of authority suggestion derived from Cattell’s 
(1957) Objective Analytic Test Battery. These 
findings can be regarded as corroborative to the 
present results to the extent that they are not 
tied to the orthodox F Scale. Canning and Baker's 
study is limited in that one conformity measure 
was used, and Crutchfield's in that consistency 
in conformity was demonstrated within a situa- 
tion (artificially interactive group) but not be- 
tween situations. 

The question not answered by the present 
study is what the remaining variance of the F 
Scale is measuring after acquiescent response set 
has been eliminated (as in the FCF scale). 
Rokeach (1960, 1961) has distinguished between 
open and closed belief systems, which are defined 
in terms of “the extent to which the person can 
receive, evaluate and act on relevant information 
received from the outside on its own intrinsic 
merits, unencumbered by irrelevant factors . . . 
[1960, p. 57]." He claims that his Dogmatism 
Scale is sensitive to a person's relative position 
on an open-closed system dimension, and that it 
is a measure of “general” authoritarianism, while 
the F Scale measures only “rightist” authoritar-
ianism. At all events, this lead is promising and
receives support in a study by Kelman and Bar- 
clay (1963), who interpret the F Scale as a 
measure of perspective or range of tolerance. The 
scale reflects “a person's psychological capacity 
for shifting contexts and accepting differences, 
and the opportunities for widening his experi- 
ences provided by his environment [p. 608].” 

This indeed implies that seemingly overworked
connotations such as rigid versus adaptable and
narrow versus broad are at the root of the au-
thoritarian personality syndrome. It should be
emphasized, however, that authoritarianism is by 
no means the only personality dimension which 
relates to conformity behavior. Measures of in- 
telligence, assertiveness, anxiety, neural reserves, 
extraversion, realism, and certain value orienta- 
tions, have been found to discriminate in female 
subjects between high and low conformers 
(Vaughan, 1964). The present findings add au- 
thoritarianism to this list, for both male and 
female subjects, and suggest that consistent con- 
formers might be described as relatively rigid and 
narrow-minded. 

60 middle-class married couples: (a) that the spouses’ assumed agreement exceeds
their actual agreement, (b) that assumed agreement is positively associated with
marital satisfaction, (c) that actual agreement is correlated with marital
attraction only in areas where agreement is instrumental for promoting the pair's
goals. Ss responded to a variety of marriage-relevant items; they ranked 2 sets
of marriage goals, and described both partners' real and ideal behavior. The 1st
2 hypotheses were supported; the 3rd was not clearly confirmed, but it appears
applicable for explaining results from other studies.


Byrne and Blaylock (1963) recently re-
ported that a sample of husbands and wives
tended to be similar in certain important at-
titudes, but that assumed similarity between
two spouses was significantly higher than
actual similarity. They explained this result
in terms of Newcomb's (1953) A-B-X model;
that is, cognitive symmetry between husband
and wife is easier to attain through mispercep-
tion than through “any alternate process,”
such as objective change in one’s actual atti-
tudes. Byrne and Blaylock went on to suggest
“It is possible that the magnitude and the
direction of correlations between self-scores
and assumed spouse scores . . . provides an
index of marital satisfaction [p. 639].”

The present paper corroborates Byrne and
Blaylock’s results in a nonstudent sample, and
then tests their suggestion that assumed
agreement is an index of marital satisfaction.
Finally, it examines the conditions under
which actual agreement is positively related to
interpersonal attraction.

Hypothesis 1 is that the spouses’ assumed
agreement exceeds their actual agreement
about the importance of marriage-relevant
topics. This hypothesis corresponds to that of
Byrne and Blaylock (1963, p. 637). It de-
rives from the assumption that married part-
ners have predominantly positive feelings for
each other, and that their striving for cogni-
tive symmetry leads them more often to un-
derestimate than to overestimate their dif-
ferences.

1 The data were collected under Grant MH-04653
from the United States Public Health Service. An
early version of the paper was read at the annual
meeting of the American Psychological Association
in Philadelphia, 1963. Clinton Fink, J. R. P. French,
Jr., and George H. Wolkon made helpful comments
on a previous draft.

2 Now at the University of Massachusetts.

Hypothesis 2 states that assumed agree- 
ment varies directly with marital satisfaction. 
This hypothesis stems from the same reason- 
ing as Hypothesis 1, but it explicates the con- 
nection between assumed agreement and at- 
traction. While numerous previous studies 
have demonstrated the relation between as- 
sumed similarity or agreement and attraction 
in peer groups (e.g., Byrne, 1961; Fiedler,
1954; Fiedler, Warrington, & Blaisdell, 1952; 
Newcomb, 1961; Smith, 1958), there has been 
little such evidence pertaining to marriage 
pairs. For example, Wallin and Clark (1958) 
tested the hypothesis in regard to spouses' 
preferred frequency of coitus; they found that 
assumed similarity and marital satisfaction 
were clearly related for husbands, but only 
slightly so for wives. 

To what extent is attraction related to 
actual agreement? Various studies have re- 
ported that friends are more similar than 
nonfriends on a variety of issues (Bonney, 
1946; Newcomb, 1961; Precker, 1952; Rich- 
ardson, 1940), but the data are far less clear 
for married or courting couples. For example, 
Rollins (1961) found no association between 
marital attraction and consensus about a set 
of values, though Elliott (1960) reported a 
positive correlation for marital roles. Kerck-
hoff and Davis (1962) found that value agree-
ment related to progress in courtship, but not
for couples who had known each other for at
least 18 months. Farber (1957) implied that
value agreement was positively correlated with
marital satisfaction; however, Breedlove's
reanalysis of his data from another study
(Farber, 1960) found an average r of only
.07.

The limits of the association between at- 
traction and actual agreement are only 
vaguely understood. Schachter (1951) and 
Newcomb (1961) each have proposed that 
the relevance of the consensual object will 
condition the strength of the pressures toward 
agreement. Yet Schachter (1951, p. 191), 
whose study confirmed this supposition, ad- 
mitted unclarity in his definition of relevance. 
In contrast, Newcomb (1961) gave a defini-
tion (that “common relevance” is “joint de-
pendence of A and B upon object X [p. 13]”)
but he did not empirically test the effect of
relevance.

It is reasonable to suppose that relevance 
does indeed influence the relation between 
agreement and attraction, but one needs a 
principle that indicates relevance. Also, rele- 
vance alone is not a sufficient determinant— 
agreement about moral philosophy may be 
relevant, but it has only vague implications 
for two partners’ satisfaction with their rela- 
tionship. 

We propose that there is a positive correla- 
tion between actual agreement and attraction, 
but only to the extent that such agreement 
promotes the achievement of the group's goals. 
The more that agreement is instrumental for 
furthering the goals of the marital relation- 
ship, the higher should be the correlation with 
marital satisfaction. In a similar vein, Byrne 
and Blaylock (1963) proposed that agreement 
would be expected “to influence attraction 
only if [it results] in positive or negative re- 
inforcements [p. 639].” 

This qualification may be useful. It can 
help to explain some superficially contradic- 
tory results. For example, Elliott (1960) 
found a positive correlation between marital 
satisfaction and agreement about marital 
roles, but Rollins (1961) found no correlation 
between such satisfaction and agreement 
about general marital values. Although roles 
and values are both relevant to a marriage,
agreement about the latter seems less instru-
mental for furthering the couple’s interaction.
One can also explain how it happened that 
value consensus predicted positively to court- 
ship progress for short-term dating couples, 
but not for long-term dating couples (Kerck- 
hoff & Davis, 1962); and that value con- 
sensus did not correlate with marital happi- 
ness among Farber’s married couples. We 
postulate that agreement about such general 
values has high instrumentality during the 
early phases of courtship when two persons 
are getting acquainted; in relationships that 
have survived the preliminary screening 
phases, such agreement would become less and 
less instrumental, while agreement about more 
immediate topics becomes more so. 

Hypothesis 3, then, is that the association 
between actual agreement and attraction is 
greater for objects with high than for objects 
with low instrumentality of agreement. 


METHOD 


The subjects were 60 married couples, who had
volunteered to take part in a research project on 
husband-wife relationships. Twenty-four couples 
were clients at a family service agency; the other 36 
couples were parents of children at an elementary 
school, selected to correspond to the counseling 
clients in major social characteristics but presumably 
differing in marital attraction. The couples had been 
married between 4 and 22 years, and all had chil- 
dren. The average couple was in its late thirties, 
had been married 13.6 years, had three children, 
and had a middle-class occupational and educational 
background. 

During a lengthy interview session, conducted 
with husband and wife in separate rooms, each re- 
spondent gave a large amount of information about 
his marriage. Attitudes toward family life were 
measured in several ways: ranking the relative im-
portance of 9 different marriage goals, ranking the
relative importance of 11 marital communication
topics, and responding to the short form of Levinson and
Huffman's (1955) Traditional Family Ideology Scale.
Subjects also described the couple's real and ideal 
behavior in several areas of marriage: task perform- 
ance, decision making, frequency of communication, 
social-emotional supportiveness, and frequency of 
sexual relations. 

Actual agreement was measured on each of the 
above indices. Assumed agreement was measurable
on only the first two instruments. 
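
Agreement on the two ranking instruments is reported later as rank-order correlations (rhos). A minimal sketch of how such indices could be computed for one couple follows; it assumes complete, untied rankings (respondents had to use the entire set of ranks), and the variable names and example rankings are hypothetical, not data from the study.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-order correlation for two complete, untied rankings
    of the same items: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


# Hypothetical rankings of 9 marriage goals (1 = most important).
husband_own = [1, 2, 3, 4, 5, 6, 7, 8, 9]
wife_own = [2, 1, 4, 3, 6, 5, 8, 7, 9]
husband_guess_of_wife = [1, 2, 3, 5, 4, 6, 7, 8, 9]

# Actual agreement: each spouse's own ranking compared with the other's.
actual_agreement = spearman_rho(husband_own, wife_own)
# Assumed agreement (husband): his own ranking against the ranking
# he attributes to his wife.
assumed_agreement = spearman_rho(husband_own, husband_guess_of_wife)
print(round(actual_agreement, 2), round(assumed_agreement, 2))
```

In this invented example the husband's assumed agreement is higher than the couple's actual agreement, which is the pattern Hypothesis 1 predicts.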

Marital satisfaction was measured by an index 
composed of each respondent's factor score derived 
from 15 subsidiary indices of marital satisfaction. 
This index was loaded more heavily with social-
emotional than with task-oriented sources of satis-
faction (see Levinger, 1964).


TABLE 1
ACTUAL VERSUS ASSUMED AGREEMENT ABOUT MARITAL GOALS

                          Agreement: mean rhos               Significance of differences
                                                                   (by sign test)a
                     (a) Actually   (b) Assumed  (c) Assumed
Topics               found for      by husband   by wife     a versus b   a versus c
                     couples
Communication goals  .39 (.29)b     .62 (.34)    .68 (.23)      .01          .01
General goals        .44 (.32)      .60 (.34)    .46 (.36)      .01          ns

a In testing the significance of the differences, pairs of individual rhos were
compared for each of the 60 spouses. For example, for communication goals, 47 out
of 60 husbands assumed higher agreement than actually existed, and 12 husbands
assumed lower than actual agreement.

b Numbers in parentheses refer to standard deviations of the rhos.


The correlation between husbands’ and wives’
marital satisfaction scores was .45 (p < .01); this r
is somewhat lower than the .59 correlation obtained
by Terman (1938) in his study of marital happiness.
In our sample, the husband and wife groups did not
differ significantly in their mean satisfaction scores
(t = .41). For husbands, M = 164.42, σ = 62.46;
for wives, M = 168.97, σ = 52.13. However, the
agency couples did have a clearly lower mean satis-
faction score than did the school couples (t = 6.08,
p < .001), confirming the validity of the index.


RESULTS


The sample showed a moderate but signifi- 
cant correlation in the spouses’ scores on the 
test of Traditional Family Ideology (r = .32,
p < .01). The magnitude of this coefficient
corresponds to that of the correlations ob- 
tained by Byrne and Blaylock (1963, Table 
2) for political attitudes, which ranged in 
size from .30 to .44. 

Table 1 shows the mean rhos obtained be- 
tween spouses for both actual and assumed 
agreement. Each of the mean rhos for as- 
sumed agreement was higher than the re- 
spective correlation for actual agreement, and 
three out of four were significant beyond the 
1% level. These results are parallel to those 
of the Byrne-Blaylock study. They constitute 
support for Hypothesis 1. 
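
The sign test behind the significance levels in Table 1 can be sketched with the counts given in the table footnote (47 husbands assuming higher than actual agreement, 12 assuming lower). Two points are assumptions of this sketch rather than statements from the paper: the one remaining husband is treated as a tie and dropped, and the one-tailed probability is the quantity computed.

```python
from math import comb


def sign_test_p(n_plus, n_minus):
    """One-tailed sign-test probability: the chance of observing at least
    n_plus positive differences out of n_plus + n_minus untied pairs
    when positive and negative differences are equally likely."""
    n = n_plus + n_minus
    favorable = sum(comb(n, k) for k in range(n_plus, n + 1))
    return favorable / 2 ** n


# Counts for communication goals, husbands (from the Table 1 footnote).
p_one_tailed = sign_test_p(47, 12)
print(p_one_tailed < 0.01)
```

Doubling the result gives the two-tailed probability; with a split as uneven as 47 to 12 either version falls well below the .01 level.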

These findings, as well as those to be pre- 
sented subsequently, hinge largely on the 
analysis of dyadic relationships. Cronbach
(1958) has pointed out that dyadic analysis 
may be “a breeding ground for artifacts.” The 
results for our goal measures in Tables 1-3 all 
depend on rank-order scores, where the re- 
spondents were forced to use the entire set of 
ranks. This procedure avoids many of Cron-
bach’s criticisms (and the reliability was .86
in selected retests). Neither actual nor as- 
sumed agreement can be artificially increased 
by fixating on any given response categories, 
because the respondent must distribute his 
ranks broadly. Furthermore, actual agree- 
ment within random pairs was significantly 
lower (p <.05) than within spouse pairs. 

Table 2 indicates that assumed agreement 
was indeed correlated positively with marital 
satisfaction. Particularly for the husbands, 
there was a positive association between these 
variables. The effect was pronounced for hus-
bands below the median in their accuracy of
understanding the wife's preferences. These
low accuracy husbands showed an average
correlation of .64 between perceived agree-
ment and marital satisfaction. On the other
hand, the wives in the sample, and the hus- 
bands above the median in accuracy, had 
lower average correlations which ranged be- 
tween .25 and .30 (Breedlove, 1962). Thus, 
Hypothesis 2 receives at least moderate sup- 
port. 

Hypothesis 3 proposed that actual agree- 
ment and attraction would be more highly 
correlated for objects of high instrumentality 
than for those with low instrumentality. In


TABLE 2
CORRELATIONS BETWEEN MARITAL SATISFACTION
AND ASSUMED AGREEMENT

Topics                 Husbands    Wives
Communication goals    .39***      .19*
General goals          .45***      .29**

Note. All significance tests were one-tailed.
*p < .08.
**p < .02.
***p < .01.

TABLE 3
CORRELATIONS BETWEEN MARITAL SATISFACTION
AND ACTUAL AGREEMENT

                              Marital satisfaction
Topics of agreement           Husbands    Wives
High instrumentality
  Communication goals           .19        .18
  Real role performancea        .16        .12
Low instrumentality
  General goals                -.03        .02
  Ideal role performance       -.01       -.13

a The correlations for role performance represent mean r's
across six different areas of the spouse relationship.


our judgment, agreement about the relative 
importance of specific communication topics 
is more instrumental than that about the im- 
portance of general goals for the marriage and 
the family. The former refer to particular 
behavior on which husband and wife are inter- 
dependent—marital communication depends 
on both; the latter refer to the achievement of 
general vague objectives, each desirable but of 
low operational significance. It is also judged 
that agreement about real role performance 
in the marriage has greater instrumentality 
than agreement about ideal role performance. 
Disagreement about a partner’s ongoing be- 
havior should hinder joint performance more 
than disagreement about images of desired 
behavior. 

The correlations in Table 3 tend to give 
support to Hypothesis 3, even though none of 
the coefficients reaches the 5% level of sig-
nificance. Agreement about the relative im-
portance of things husbands and wives must
talk about together is more highly related to 
marital satisfaction than agreement about 
what are the most important marital goals. 
Similarly, marital satisfaction has more posi- 
tive correlations with agreement about the 
spouses’ real marital behavior than with agree- 
ment about ideal behavior. Hypothesis 3 is 
not adequately confirmed by the present evi- 
dence, but at least the results point in the 
right direction. 


DISCUSSION 


The thesis of this paper is essentially that 
interpersonal attraction and agreement are 
correlated variables, but that the limitations 
of this association have yet to be specified
clearly. Previous work has not made an ex-
plicit distinction between objects for which
agreement serves a central function versus
those where it serves merely a peripheral,
ancillary function. So far, the major focus has
been only to distinguish between actual versus
assumed agreement.

It was found that marital satisfaction is
significantly associated with the degree to
which a partner overestimates, or underesti-
mates, in stating his assumed agreement with
his spouse. A number of spouses, whose mari-
tal satisfaction was low, reported even less
agreement with their partner than was ac-
tually the case. One can infer that, in accord-
ance with Newcomb's model, these distortions
are rewarding to these individuals. Byrne and
Blaylock (1963) speculated that “assumed
similarity should occur only when two indi-
viduals feel positively toward one another;
partners experiencing marital discord or con-
templating divorce should not respond in this
way [p. 639].” The results substantiate this
notion.3

The findings concerning Hypothesis 3 were
rather weak. The data leaned in the predicted
direction, but marital attraction and actual
agreement showed a generally low association.
One reason for this weak support derives from
the difficulty of selecting appropriate topics
about which marital partners can indicate
their agreement. A second may be that agree-
ment about items embodied in paper-and-
pencil questionnaires does not reflect the un-
derlying extent of spouses’ agreement. Future
studies of this issue must include consensus
objects that represent a wider range of agree-
ment instrumentality. Nevertheless, we be-
lieve that the present data are sufficiently
promising to warrant further exploration of
the effect of instrumentality on the relation
between interpersonal attraction and actual
agreement.

In fact, it was possible to conduct such an
exploration by turning to a major study of
attraction and agreement, dealing with
relationships among male college dormitory
mates. Chapter 5 in Newcomb's (1961) book
on the acquaintance process is entirely devoted
to the connection between actual agreement
and mutual attraction in “two person collective
systems.” It was possible, therefore, to
examine the findings reported there to see
whether the association would be most significant
for objects with the highest instrumentality
of agreement.

³ Other unpublished research by the first author
has consistently indicated a positive association
between assumed agreement and marital satisfaction.

Newcomb, himself, did not distinguish
among the different objects on such a dimension.
Agreement, in his book, concerns three
topics: Housemates A's and B's liking for 
other housemates, their attitudes concerning 
miscellaneous public and private affairs, and 
their endorsement of a set of Spranger values. 

There was unquestionably strong support 
(p < .001) for the relation between mutual
pair attraction and agreement about the at- 
tractiveness of other house members. The cor- 
relations attain significance after only a short 
period of dormitory residence. Such agree- 
ment would seem to be highly instrumental 
for the functioning of the A-B pair, since it 
would determine what other persons they 
would go around with, what to say to each 
other when other housemates were present, 
and the like.

Agreement on Newcomb's attitude inven- 
tory or his Spranger values, while of relevance 
for the functioning of pair relationships among 
housemates, is not as immediately instru- 
mental for their mutual reward system. It is 
not surprising, therefore, to discover that the 
evidence here was much weaker. Newcomb's 
report does not clearly demonstrate a signifi- 
cant relation between pair attraction and 
agreement about the inventory items or the 
Spranger values. If he had dichotomized level 
of agreement at the median, instead of at 
varying levels of “high” versus “not high,” 
the pertinent chi-squares would have been less 
significant. It would appear, then, that his 
subjects’ agreement about such paper-and- 
pencil items was less rewarding, and disagree- 
ment was less punishing, than mutual agree- 
ment about one’s liking of other dormitory 
mates. Consensus about other friends would
have immediate reinforcement value for a
pair relationship among housemates, while
agreement about general attitudes and values
is a more peripheral matter.

⁴ In Table 5.5 (Newcomb, 1961), “not high”
means agreement of less than .97 (Year I) and less
than .90 (Year II). In Tables 5.6, 5.7, 5.8, and 5.9
the criteria for “not high” agreement are different
each time, and never at the median.
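
The dependence of a chi-square on the choice of cutting score can be illustrated with a minimal sketch. The agreement scores and attraction ratings below are invented for eight hypothetical pairs and are not Newcomb's data; the helper names `cross_tabulate` and `chi_square_2x2` are likewise assumptions made for the illustration. The sketch dichotomizes agreement once at the sample median and once at a fixed cutoff of .97, then computes the uncorrected Pearson chi-square for each resulting 2 x 2 table.

```python
from statistics import median

# Hypothetical agreement scores and attraction ratings for eight pairs.
# These values are invented for illustration only.
AGREEMENT = [0.99, 0.98, 0.97, 0.95, 0.93, 0.90, 0.88, 0.85]
ATTRACTED = [True, True, True, False, True, False, False, False]


def cross_tabulate(agreement, attracted, cutoff):
    """Return (a, b, c, d) for the 2 x 2 table of agreement by attraction.

    a: high agreement, attracted      b: high agreement, not attracted
    c: not-high agreement, attracted  d: not-high agreement, not attracted
    A pair counts as "high" when its agreement score is >= cutoff.
    """
    a = b = c = d = 0
    for score, liked in zip(agreement, attracted):
        high = score >= cutoff
        if high and liked:
            a += 1
        elif high:
            b += 1
        elif liked:
            c += 1
        else:
            d += 1
    return a, b, c, d


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square, without continuity correction, for [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denominator


# Same pairs, two different cutting scores.
median_table = cross_tabulate(AGREEMENT, ATTRACTED, median(AGREEMENT))
fixed_table = cross_tabulate(AGREEMENT, ATTRACTED, 0.97)
print("median split:", median_table, chi_square_2x2(*median_table))
print("cutoff .97:  ", fixed_table, chi_square_2x2(*fixed_table))
```

With these invented values the fixed cutoff yields the larger chi-square, which is the kind of difference the argument above turns on; real data could of course behave differently.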

Newcomb (1961, Ch. 11) and his colleagues 
also attempted to predict attraction between 
roommates from their separate replies to the 
attitude inventory that was completed before 
arriving on campus. He reports, however, 
that “we failed, completely, to find support 
for the prediction [p. 216].” “Roommate” is 
an even more intimate relationship than 
“housemate,” and it appears that agreement 
on these miscellaneous attitude items had 
decreasing instrumentality with increasing 
intimacy. This finding seems analogous to 
that of Kerckhoff and Davis (1962) described 
earlier, who found a decreasing association 
between value consensus and intimacy, as a 
function of length of pair acquaintance. 

One other study is relevant. Broxton 
(1963), working from a conceptual framework 
derived from Newcomb, did report a significant
association between roommate attraction and 
actual agreement among college girls. One 
would ask: how instrumental was agreement 
about Broxton's items? Her questions referred 
to agreement about trait descriptions of each 
roommate herself. And what topic could be 
more vital to the respondent or to the rela- 
tionship? 

Newcomb’s and Broxton’s studies have been 
discussed primarily to test the generality of 
our hypothesis about attraction and actual 
agreement. Results from these studies indi- 
cate that the hypothesis has merit for sub- 
suming results other than those of our own 
research. In future studies of agreement and
attraction, it will be desirable to specify more 
precisely the instrumentality of agreement 
for each topic of potential consensus.


THE INFLUENCE OF ANTECEDENT REINFORCEMENT
AND DIVERGENT MODELING CUES ON PATTERNS
OF SELF-REWARD ¹


ALBERT BANDURA AND CAROL KUPERS WHALEN ²
Stanford University 


The present experiment was designed to test the hypothesis that the effect of 
models’ self-reinforcement contingencies on the self-reinforcing behavior of 
observers will be partly determined by their antecedent success and failure 
experiences and performance discrepancy from the comparison models. Groups
of children underwent a series of success or failure experiences following which 
they were exposed to either a superior model adopting a high criterion for 
self-reward, an inferior model displaying a very low standard for self-reinforce- 
ment, an equally competent model exhibiting a moderately high self-reward 
criterion, or they observed no models. Children in the inferior-model condition 
displayed a considerably higher frequency of self-reinforcement at low per- 
formance levels and greater magnitude of self-reward than Ss who had been 
exposed to more competent models adopting higher criteria for self-reinforcement.
In accord with social comparison theory, children rejected the self-
imposed reinforcement contingencies of the superior model and adopted a 
lower standard commensurate with their achievements. The effects of antecedent 
success-failure experiences were found to be dependent upon treatment conditions
and level of performance.


The voluminous investigations of reinforce- 
ment processes have been confined, with few 
exceptions (Bandura & Kupers, 1964; Kanfer 
& Marston, 1963a, 1963b; Marston, 1965; 
Mischel & Liebert, 1966), to situations in 
which an agent adopts a particular criterion 
with respect to a performer’s behavior, and 
dispenses reinforcers to him contingent upon 
the occurrence of desired responses. A highly 
important, but less well understood, reinforce- 
ment phenomenon characteristic of humans is 
evident in situations in which a person im- 
poses a particular response-reinforcement con- 
tingency on his own behavior, and self- 
administers reinforcers which are under his 
own control on occasions when he attains or 
surpasses the self-prescribed standards of 
achievement. The latter event is analogous to
providing a rat in a Skinner box with a generous
supply of delectable pellets which he
self-administers following commendable bar-press
performances, but denies himself when
he judges his attainments to be substandard.

¹ This investigation was supported in part by Research
Grant M-5162 from the National Institutes of
Health, United States Public Health Service. The
study was conducted while the junior author was
the recipient of a National Science Foundation
undergraduate research fellowship.

The authors are grateful to Quentin Bryan and
Herbert Popenoe of the Inglewood and Los Angeles
School Districts, respectively, for their assistance in
arranging the research facilities.

² Now at the University of California, Los Angeles.

In a previous investigation of the determi- 
nants of self-reinforcing responses (Bandura 
& Kupers, 1964), it was found that children’s 
patterns of self-reward and self-punishment 
closely matched those of models to whom they 
had been exposed. Subjects who observed 
models adopting a high criterion for self- 
reinforcement utilized positive reinforcers 
sparingly and only when they achieved rela- 
tively high levels of performance, whereas 
children who had observed low-standard 
models rewarded themselves generously even 
for minimal performances. 

There are several factors that might ac- 
count for the surprisingly precise matching 
of the models’ patterns of self-reinforcement 
obtained in the preceding study. First, the 
scores on the particular task employed did 
not have much absolute significance and 
consequently, they provided the subjects 
little basis for judging what might constitute 
an inadequate or a superior performance inde- 
pendent of some reference norm. Even if 
relevant normative data were available, since
both the subjects’ and models’ performances
varied widely, the children had no reliable
basis for evaluating their own abilities. Thus 
the combination of performance ambiguity 
and instability would tend to enhance the 
potency of the models’ standard-setting and 
self-reinforcing responses. 

Under conditions of greater performance
stability, the effect of a model on the self-reinforcing
behavior of an observer is likely
to be determined, in part, by the discrepancy
in ability between the participants, and by
the observer’s history of positive and negative
reinforcements with respect to achievement
behavior. In order to investigate systematically
the influence of the latter variables
and their interaction on the social transmission
of self-reinforcement patterns, an experiment
was conducted in which groups of children
underwent a series of success or failure
experiences on a variety of achievement tasks.
Following the differential treatments, one 
fourth of each group was exposed to a model 
adopting a high criterion for self-reward and 
performing the experimental task at a con- 
sistently superior level relative to that of 
the children; one fourth observed a model 
displaying a very low standard for self- 
reinforcement and performing at an inferior 
level; one fourth watched an equally com- 
petent model exhibiting a moderately high 
self-reward criterion, while the remaining 
subjects served as a no-model control group. 
After exposure to their respective models, the 
children in all conditions received the same 
pattern of scores on the modeling task, and 
the performances for which they rewarded 
themselves were recorded.

In the case of performances for which objective,
nonsocial criteria of adequacy are
lacking, the achievements of others serve as 
the only standard against which meaningful 
self-evaluations can be made. According to 
social comparison theory (Festinger, 1954),
persons tend to select reference models who 
are similar in ability, and to reject those who 
are too divergent from themselves. It was, 
therefore, predicted on the basis of the latter 
theory that subjects would adopt the self- 
reinforcement contingencies of the model 
whose ability or competence was similar to 
their own. On the other hand, observers whose
performances are comparatively low and
markedly discrepant from a model’s achievements
would tend to view the comparison
person as too divergent in ability to serve
as a meaningful model for self-evaluation.
Accordingly, it was hypothesized that children 
who had been exposed to the superior model 
would reject his high criterion for self- 
reinforcement and adopt a lower standard 
commensurate with their own achievement 
level. This rejection process would be reflected 
in a pattern of self-reinforcement equivalent 
to that adopted by children in the condition 
involving the equally competent model, whose 
criterion for self-reward corresponded to the 
modal performance of all subjects. 

Had the behavior of peer models been em- 
ployed as the standard for self-evaluation, one 
would presuppose from social comparison 
theory that subjects might similarly reject 
the self-imposed reinforcement contingencies 
of an inferior model. However, evidence that 
achievements which match or exceed posi- 
tively evaluated low performances by adults 
tend to be regarded by children as highly 
commendable and worthy of reward (Bandura 
& Kupers, 1964), suggests that upward dis- 
crepancies from adult models result in en- 
hanced self-evaluation, rather than rejection 
of the model. Consequently, it was predicted 
that following low performances, children in 
the inferior-model condition would display a 
significantly higher incidence of self-reward- 
ing responses than subjects in the equally 
competent and superior-model conditions.
Children in the three modeling treatments 
were not expected to differ in the frequency 
of self-reinforcement associated with moder- 
ately high and superior performances. Since, 
however, moderately high scores are likely 
to be evaluated as only marginally commend- 
able achievements by children exposed to 
competent models, but as meritorious at- 
tainments by children who have observed an 
inferior adult model, it was hypothesized 
that relative to the former groups, subjects 
in the latter condition would engage in a
higher magnitude of self-reward following 
high-level performances. 

There is some research evidence (Stotland 
& Zander, 1958) that persons who have 
undergone failure experiences lower their
evaluation both of the quality of their original
performances and of closely related
abilities. On the assumption that low self- 
evaluations are accompanied by reduced self- 
reinforcing behavior, it was expected that sub- 
jects in the failure condition would exhibit 
a lower frequency and magnitude of self- 
reward than children who had experienced 
repeated success.


METHOD 
Subjects 


The subjects were 80 boys and 80 girls ranging 
in age from 8 to 11 years, drawn from the Los 
Angeles and Inglewood School Districts. 

Three adult males served as experimenters, and 
a male and a female adult played the roles of models. 


Success-Failure Treatment 


In the initial phase of the study each child was 
randomly paired with a same-sex partner. While 
accompanying the children to the experimental room, 
the experimenter introduced himself as a college stu- 
dent who was conducting a normative study of the 
physical skillfulness and reasoning ability of both 
children and adults. He explained that children were 
tested simultaneously in order to expedite collection 
of the data, and that, for similar reasons, adults 
also would be participating at the same time and 
place. 

After the experimenter had randomly assigned one 
of the pair to the success condition and the other 
to the failure group, the partners performed alter- 
nately on the same three tasks. Several different 
tasks varying considerably in content were employed 
so as to produce a relatively generalized success or 
failure effect. Since the experimenter, in fact, controlled
the scores, it was possible to ensure that
children in the success condition performed signifi- 
cantly better on the series of trials than those in 
the failure treatment. In order to control for the 
possibility that a child might discount the achieve- 
ment disparity as being a function of a chance idio- 
syncratic matching, the subjects were provided with 
fictitious normative data that corroborated the situa- 
tionally produced differential outcomes. In addition, 
at the conclusion of the three tasks, the consistent 
discrepancies in the performances of the two children 
were further underscored by a summary evaluation 
highlighting their differential achievements relative 
to each other and to the normative group.

The first task, which supposedly measured physical 
strength, consisted of a wooden box on the front 
of which was mounted a dial with numbers ranging 
from 0 to 30. The subjects were led to believe that 
when they pulled the handle attached to the box, 
the number that registered on the dial provided a 
measure of their physical strength. But actually the
scores were predetermined and controlled by the
experimenter by means of a rheostat located at the
back of the apparatus. Children in the success condition
gained a total of 40 points based on two trials,
subjects in the failure group obtained a combined
score of 25, and the normative achievement for children
of their age was presented as 30 to 35 points.

The second task was introduced as a test of 
problem-solving ability. The task utilized four stimu- 
lus items and a deck of response cards, each con- 
taining geometric figures varying in number, shape, 
and color. The children were instructed to figure 
out which dimension, or combination of dimensions, 
was the crucial one, and then to sort the deck of 
response cards under the four stimulus items. Since
the children received no immediate feedback con- 
cerning the accuracy of their sorts, and considering 
also that several classifications would be correct, 
the subjects had no means of evaluating their 
performances. 

On the latter task, children in the failure condition 
were informed that they received a score of 24 
points; in the success condition, 36 points. The 
experimenter then added that most children obtained 
30 points. 

The third task was structured as a measure of 
psychomotor dexterity. The apparatus consisted of 
a cylindrical can with holes in the top, and small 
plastic straws on a tray positioned under the holes. 
The goal was to pull the straws out of the holes
using a pair of tweezers, without touching the sides
of the holes. A buzzer system was devised so that,
instead of signaling whenever the tweezers made 
contact with the metal, it sounded whenever the 
experimenter pushed a concealed button. Children 
in the success condition obtained a total of 35 points. 
Once again, subjects in the failure treatment were 
less successful, receiving only 20 points. The experi- 
menter concluded the game with his normative state- 
ment, “Most boys [girls] of your age get 30 points.” 

At the completion of all three tasks, children who 
underwent the series of failure experiences were 
informed that their total score of 69 points was
relatively low compared both to their partner’s score 
and to the normative score of 95, whereas subjects 
in the success condition were told that their com- 
bined score of 111 points represented a meritorious 
performance. 
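
The totals quoted above can be checked against the task-by-task scores. The sketch below is only an arithmetic illustration; the dictionary and function names are assumptions introduced for it. It sums the scores announced on the three tasks and, because the strength norm was quoted as a range, reports the lowest and highest normative totals consistent with the text.

```python
# Scores announced to each child on the three tasks, as reported in the text.
SUCCESS_SCORES = {"strength": 40, "problem solving": 36, "dexterity": 35}
FAILURE_SCORES = {"strength": 25, "problem solving": 24, "dexterity": 20}

# Normative scores quoted to the children. The strength norm was given as a
# range (30 to 35 points); the other two norms were single values of 30.
NORM_RANGES = {
    "strength": (30, 35),
    "problem solving": (30, 30),
    "dexterity": (30, 30),
}


def total(scores):
    """Sum the announced scores over the three tasks."""
    return sum(scores.values())


def norm_total_bounds(norm_ranges):
    """Return the lowest and highest totals consistent with the quoted norms."""
    low = sum(lo for lo, _ in norm_ranges.values())
    high = sum(hi for _, hi in norm_ranges.values())
    return low, high


print("success total:", total(SUCCESS_SCORES))
print("failure total:", total(FAILURE_SCORES))
print("normative total bounds:", norm_total_bounds(NORM_RANGES))
```

The task scores reproduce the reported totals of 111 and 69. The normative total of 95 matches the upper bound of the quoted norms; that it was derived from the upper end of the strength range is an inference, not something the text states.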


Modeling of Self-Reinforcement Contingencies 


Following the success and failure treatments, the 
model made his appearance. The experimenter invited 
the children to rest while the model took the first 
turn on the next task—a miniature bowling game— 
which provided the means for displaying the model’s 
competence level and his adopted criteria for self-reinforcement.
Since in the previous study (Bandura
& Kupers, 1964), sex of the model had no differential 
effects on self-rewarding behavior, there was no 
attempt to manipulate this variable in the present 
study. All children observed same-sex models.

The bowling apparatus consisted of a 3-foot runway
bounded at the far end by vertical fiberboard
shields. Seven jewel lights, labeled with numbers
ranging from 5 to 20, were mounted in two staggered
rows on the front shield. The subjects were informed
that whenever a bowling ball hit a target (purport- 
edly behind the fiberboard shield) the corresponding 
light would flash on. Since there were no visible 
targets, the children could not evaluate their per- 
formances independently of the flashing scoreboard. 
Moreover, high and low numbers were placed in 
adjacent positions, so that any score seemed plausible 
no matter where the bowling ball actually rolled. 
The experimenter controlled the flashing scoreboard 
via a remote monitor so that the models performed 
identically within each condition, and all children 
received the same pattern of scores. 

Before commencing the modeling trials, the experi- 
menter called the participants’ attention to a bowl 
of assorted candies near the starting point of the 
alley within easy reach of the bowler. He explained 
that this energy-building food was supplied for 
their benefit, that they should help themselves when- 
ever they wished, and that if they did not feel like 
eating all the candy during the session they could 
save it in the paper cup provided. A variety of 
candies was utilized as positive reinforcers in order 
to avoid satiation effects. 

The model then played 12 bowling games consist- 
ing of three balls per game, while the children 
observed. In the superior-model high-criterion
condition, the model obtained scores ranging from 
25 to 60 points and rewarded himself with candy 
and positive self-evaluative verbalizations only when 
he obtained or exceeded a score of 40. After these 
high-score performances he treated himself to candy 
and commented approvingly, “I deserve some candy 
for that high score . . . That's great! That certainly 
is worth a treat.” By contrast, on trials in which 
he failed to meet the adopted criterion of 40, he 
refrained from taking candy and remarked self- 
critically, “No candy for that . . . That does not 
deserve a treat.” 

In the condition involving the equally competent 
model displaying a moderately high criterion for self- 
reinforcement, the performance scores ranged from 15 
to 40, with the adopted standard being 25 points. 
Except for the lower self-imposed criterion, the self- 
rewarding and self-punitive responses were identical 
in form and frequency to those in the superior- 
model condition. On trials in which the model 
obtained or exceeded a score of 25 points, he re- 
warded himself with candy and commented self- 
approvingly, while on trials in which he performed 
below the minimum standard he denied himself 
candy and engaged in self-critical behavior. 

In the inferior-model low-criterion condition, the 
model’s scores ranged from 5 to 25 and the self- 
administration of material and verbal reinforcers was 
made contingent on obtaining a score of 15 points 
or higher. 
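
The three modeled contingencies can be summarized in a short sketch. The class and function names are hypothetical, and only the score ranges and criteria come from the text. The rule is applied to any score, including scores a given model never obtained, which is an extrapolation made for the illustration; magnitude of self-reward is not represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCondition:
    """One modeling condition: the model's score range and self-reward criterion."""
    name: str
    lowest_score: int
    highest_score: int
    criterion: int

    def rewards(self, score: int) -> bool:
        """The model takes candy only when the score meets or exceeds the criterion."""
        return score >= self.criterion


# Score ranges and criteria as reported for the three modeling conditions.
CONDITIONS = [
    ModelCondition("superior", 25, 60, 40),
    ModelCondition("equally competent", 15, 40, 25),
    ModelCondition("inferior", 5, 25, 15),
]


def rewarding_conditions(score: int) -> list:
    """Names of the conditions whose criterion the given score satisfies."""
    return [c.name for c in CONDITIONS if c.rewards(score)]


for score in (10, 20, 30, 40):
    print(score, rewarding_conditions(score))
```

A score of 20, for instance, satisfies only the inferior model's criterion, whereas a score of 40 satisfies all three, which is why low scores are the ones expected to separate the conditions.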

It should be noted that for the purposes of the 
present experiment, criterion level and competence 
had to be covaried. That is, a superior model could 
not adopt a low standard for self-reinforcement
unless he obtained low scores which would thereby 
reduce his competency; conversely, in order for the 
inferior model to display high criteria for self-reward,
it would have been necessary to convert him into
a more competent performer. While the effects of 
self-imposed contingencies and competence can be 
evaluated independently, it was neither feasible nor 
particularly meaningful to do so for the phenomenon
investigated in this study. 

In all model conditions there was some slight 
variation in magnitude of self-reward. Models treated 
themselves to one piece of candy for criterion-level 
scores, two pieces for performances slightly above 
their adopted standard, and three for scores excep- 
tionally high relative to this standard. 

Children in the control group similarly partici- 
pated in the success-failure session and the test for 
self-rewarding behavior, except that they had no 
intervening exposure to a model. 


Measurement of Self-Rewarding Responses 


After the model completed his 12 trials, the experi- 
menter described to him the tasks the children had 
participated in before his arrival. The experimenter 
then offered to work with the model on the latter 
tasks, explaining to the children that in order to 
expedite matters, two other adults would play the 
bowling game with them. The children then per- 
formed the bowling task simultaneously in separate 
rooms with the new experimenters. The subjects 
were tested separately by adults who were not pres- 
ent during the success-failure and the modeling 
phases of the study in order to remove any situational
pressures on the children to adopt the model’s
patterns of self-reinforcement. In order to control 
for any possible experimenter influences, the experi- 
menters were counterbalanced across success-failure 
treatments, and they had no knowledge of the 
conditions to which the subjects were assigned. 

Before commencing the trials, the experimenter replenished
the candy supply, and repeated the instructions
conveying considerable permissiveness for self-reward.
The children then performed 18 trials of
three balls each. Their scores ranged from 10 to 60
points, according to a prearranged program. Therefore,
the size and sequence of scores was identical for
all children, regardless of model condition.

For the purpose of testing the hypotheses, the 
performance levels were divided into four critical 
categories, which coincided with the modeled cri- 
teria for self-reward: 10, 15-20, 25-35, and 40-60. 
Since no model rewarded himself for scores of 10,
and in order to maintain the competence differentiations,
only two 10-point trials were included. The
remaining trials were about equally distributed 
among the other three score categories. The par- 
ticular sequence of scores was randomly determined 
except for the limitation that the three highest per- 
formances (50, 55, and 60) occurred at the end 
of the sequence in order to preserve the competence 
disparities. That is, had the extremely high scores 
been placed early in the serial order, the children 
in the superior-model condition might have judged 
themselves to be equally competent; conversely, the 
children who had observed the model displaying
moderately high performances would have viewed
themselves as superior. The three high scores at the
end of the series were primarily included to furnish
additional data regarding magnitude of self-reinforcement
at levels of achievement that clearly exceeded
the minimum criteria adopted by the models in all
experimental conditions.


TABLE 1

MEAN NUMBER OF SELF-REINFORCED TRIALS AS A FUNCTION OF TREATMENT CONDITIONS
AND PERFORMANCE LEVEL

                                                Model conditions
                    No-model control            Inferior model              Equally competent           Superior model
Subgroup            10     15-20  25-35  40-60  10     15-20  25-35  40-60  10     15-20  25-35  40-60  10     15-20  25-35  40-60
Success
  Boys              0.4    1.6    1.5    1.0    1.4    5.6    4.7    4.5    0.6    3.7    4.3    3.9    0.7    2.1    3.3    4.6
  Girls             0.5    1.6    1.7    1.3    0.8    4.9    4.4    4.5    0.3    2.1    4.4    3.8    0.3    2.1    4.0    4.6
  Total             0.5    1.6    1.6    1.2    1.1    5.3    4.6    4.5    0.5    2.9    4.4    3.9    0.5    2.1    3.7    4.6
Failure
  Boys              0.9    3.0    2.6    1.6    0.5    4.2    3.9    4.0    0.6    2.2    3.1    2.8    0.9    2.8    2.9    4.2
  Girls             0.6    2.7    2.3    1.5    0.4    3.8    3.5    3.1    0.5    2.6    4.5    3.8    0.6    2.5    3.3    4.0
  Total             0.8    2.8    2.5    1.6    0.5    4.0    3.7    3.6    0.6    2.4    3.8    3.3    0.8    2.7    3.1    4.1
Combined subgroups  0.6    2.3    2.0    1.4    0.8    4.6    4.1    4.0    0.5    2.7    4.1    3.6    0.6    2.4    3.4    4.4
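
The scoring scheme described above can be sketched as a tally of self-reinforced trials within the four performance categories. The 18-score sequence and the child's reward rule below are hypothetical, since the report gives neither the actual sequence nor individual records; the sequence is built only to respect the stated constraints (two 10-point trials, and 50, 55, and 60 as the final three scores). The function names are also assumptions.

```python
# Performance categories used in the analysis: (label, lowest score, highest score).
CATEGORIES = [("10", 10, 10), ("15-20", 15, 20), ("25-35", 25, 35), ("40-60", 40, 60)]

# Hypothetical 18-trial sequence; the prearranged sequence used in the study
# is not reported. It has two 10-point trials and ends with 50, 55, 60.
HYPOTHETICAL_SCORES = [20, 35, 10, 15, 40, 25, 15, 30, 20,
                       10, 15, 35, 45, 25, 20, 50, 55, 60]


def performance_category(score):
    """Return the label of the category that contains the score."""
    for label, low, high in CATEGORIES:
        if low <= score <= high:
            return label
    raise ValueError(f"score {score} falls outside every category")


def count_self_reinforced(trials):
    """Count self-reinforced trials per category.

    trials is an iterable of (score, took_candy) pairs for one child.
    """
    counts = {label: 0 for label, _, _ in CATEGORIES}
    for score, took_candy in trials:
        if took_candy:
            counts[performance_category(score)] += 1
    return counts


# A hypothetical child who takes candy whenever the score is 25 or higher.
child_trials = [(score, score >= 25) for score in HYPOTHETICAL_SCORES]
print(count_self_reinforced(child_trials))
```

Averaging such per-category counts over the children in a subgroup would give entries of the kind shown in Table 1.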

The experimenter recorded the trials for which 
the children rewarded themselves with candy, the 
total number of reinforcers taken on each self- 
reinforced trial, and the frequency of positive and 
negative self-evaluative verbalizations.

After each child completed the 18 bowling trials, 
the two partners met again and were readministered 
the success-failure tasks in order to neutralize the 
effects of the experimental manipulations. This time 
the two children received similar scores and were 
highly praised for their performances and thanked 
for their participation. 


RESULTS 
Frequency of Self-Reinforcement 


Table 1 shows the mean number of self- 
reinforced trials displayed by subjects in the 
various experimental and control subgroups. 
The obtained differences were evaluated by 
the Kruskal-Wallis test for 10-point perform- 
ances, and by a three-way analysis of variance 
for the remaining score categories, with
modeling cues, success-failure conditions, and 
sex of subjects serving as the three independ- 
ent variables. 

Children in all groups rewarded themselves 
relatively infrequently following 10-point 
scores, and did not differ in this respect. At 
succeeding performance levels, however, both
modeling cues and the interaction of modeling
and antecedent reinforcement variables were
important determinants of self-reinforcing behavior.
These differences are shown graphically
in Figures 1 and 2.

Low performance level. Analysis of the
frequency of self-reinforcement associated
with 15-20 point scores reveals a highly significant
modeling effect (F = 13.73, p < .001),
and a Models × Reinforcement interaction
(F = 3.37, p < .05), indicating that
the success-failure experiences had a differential
impact on the various groups. This significant
interaction effect was primarily due
to the fact that prior failure decreased self-reinforcing
behavior among children in the
inferior-model condition (t = 2.07, p < .025),
but increased the incidence of self-reinforcement
among children in the control group
(t = 2.07, p < .05).

FIG. 1. Mean number of self-reinforced trials as
a function of treatment conditions and performance
level by subjects in the success condition.

FIG. 2. Mean number of self-reinforced trials as a
function of treatment conditions and performance
level by subjects in the failure condition.

Further comparisons of pairs of means by
the t test reveal that children in the success
condition who had been exposed to the in- 
ferior model, as predicted, engaged in con- 
siderably more frequent self-rewarding be- 
havior than children who had observed either 
equally competent or superior models, or 
those in the control group (Table 2). The
hypothesis concerning rejection of highly
superior models was also confirmed by the 
finding that children in the equally com- 
petent and superior model conditions did not 
differ significantly in their self-rewarding 
behavior. 

Although the potency of modeling cues 
noted above was somewhat reduced under 
conditions of failure, children who had ob- 
served inferior models nevertheless showed a 
higher frequency of self-reinforcing responses 
relative to the other two modeling groups 
(Table 2). In accord with results from the 
success treatment, subjects in the equally 
competent and superior-model conditions who 
had undergone failure likewise did not differ 
significantly.

Moderately high performance level. Analysis
of variance of the frequency of self-reinforcement
following 25-35 point performances
similarly reveals a highly significant
modeling effect (F = 21.37, p < .001), and
a Models × Reinforcement interaction (F =
3.25, p < .05). Consistent with the preceding
findings, the interaction reflects a lower
incidence of self-rewarding behavior by children
in the conditions involving the inferior
(t = 2.01, p < .05), equally competent (t =
1.30, p < .10), and superior models (t =
1.30, p < .10), and increased self-reinforcement
by controls (t = 1.89, .05 < p < .10)
as a function of failure experiences.


TABLE 2 


SIGNIFICANCE OF DIFFERENCES BETWEEN EXPERIMENTAL AND CONTROL GROUPS
IN FREQUENCY OF SELF-REINFORCEMENT 


Comparison of pairs of treatment conditions 


Performance level 


Inferior Inferior Inferi ior 
E a Ti 6.059 3.906 522: 245* 1.33 0.83 
Failure 191 2.658 224^ 05 0.41 0.33 
25-35 point performances 
Falke de [0A | ta | $T 38 | ae 
40-60 point performances 8.03 1.35 0.98 6.68 2.33* 9.0pee 


Note.—One-tailed tests were employed in instances where a specific hypothesis was advanced; 
for differences between groups for which no predictions were made, two-tailed tests were applied. 
TABLE 3 


MEAN NUMBER OF REINFORCERS SELF-ADMINISTERED PER TRIAL AS A FUNCTION OF 
TREATMENT CONDITIONS AND PERFORMANCE LEVEL 


                                                            Model conditions 
                No model control        Inferior model          Equally competent       Superior model 
                  10 15-20 25-35 40-60    10 15-20 25-35 40-60    10 15-20 25-35 40-60    10 15-20 25-35 40-60 

Success 
  Boys           1.2   1.9   2.9   2.5   1.1   1.9   3.1   4.5   1.9   1.7   3.1   3.0   1.1   2.0   2.0   3.3 
  Girls          1.0   1.5   1.8   1.8   1.5   1.5   2.8   3.9   1.3   1.8   2.8   2.7   1.0   1.0   1.4   2.5 
  Total          1.1   1.8   2.3   2.2   1.2   1.7   3.0   4.2   1.7   1.8   3.0   2.8   1.1   1.5   1.7   2.9 

Failure 
  Boys           2.0   3.3   5.8   4.1   1.6   2.0   3.2   4.3   1.2   2.5   3.2   3.9   1.1   1.5   1.9   2.1 
  Girls          4.0   2.3   2.3   2.3   1.3   1.5   2.1   4.3   1.5   1.3   2.1   2.9   2.0   2.3   2.5   3.9 
  Total          3.0   2.8   4.2   3.1   1.4   1.7   2.6   4.3   1.3   1.8   2.6   3.3   1.5   1.9   2.2   3.0 

Total sample     2.3   2.4   3.3   2.8   1.3   1.7   2.8   4.3   1.5   1.8   2.8   3.1   1.3   1.7   1.9   3.0 


As expected at this moderately high level 
of performance, individual t tests disclosed 
no significant differences in either success or 
failure conditions between children who had 
observed the inferior model and those in the 
equally competent or superior-model groups 
(Table 2). Children in all three modeling 
conditions who had undergone rewarding 
experiences engaged in significantly more 
self-reinforcement than did the control subjects. 
Moreover, children in the inferior-model 
group displayed a higher incidence of 
self-reinforcing responses than did subjects who 
had been exposed to the superior model. 
Under conditions of failure, however, all 
group differences were reduced and only sub- 
jects in the inferior and equally competent 
model groups exhibited a higher incidence 
of self-reinforcement than the control sub- 
jects. 

Superior performance level. Analysis of 
variance of self-reinforcing responses occurring 
after 40-60 point performances yielded a 
significant modeling effect (F = 33.13, p < 
.001), but neither prior reinforcement nor 
any interactions between the independent 
variables were statistically significant sources 
of variation. Thus, at the superior level of 
achievement subjects in all modeling condi- 
tions engaged in a high frequency of self- 
reinforcement regardless of whether they had 
previously undergone success or failure 
experiences (Figures 1 and 2). Further 
comparisons of pairs of means reveal a higher 
incidence of self-reinforcement by subjects 
in each of the treatments involving models 
compared to the control subjects (Table 2). 
In addition, children who had been exposed to 
the superior model were more self-rewarding 
following meritorious performances than sub- 
jects who had observed the equally competent 
model. 


Magnitude of Self-Reward 


The mean number of self-administered re- 
inforcers per trial as a function of experi- 
mental conditions, sex of subjects, and level 
of performance is presented in Table 3. Since 
some of the children never rewarded them- 
selves following performances at a particular 
level, the number of cases in each cell differed 
somewhat from group to group. This varia- 
tion precluded simultaneous analysis of the 
combination of experimental variables; con- 
sequently, separate one-way analyses of vari- 
ance were calculated for evaluating the effects 
of modeling cues, prior reinforcement, and 
sex of subjects at each of the four score 
categories. 

Contrary to prediction, there were no sig- 
nificant differences in the magnitude of self- 
reward as a function of prior success-failure 
experiences and, except for greater amount of 
self-rewarding behavior displayed by girls at 
the moderately high level of achievement 
(F = 9.42, p < .001), no sex differences were 
obtained. However, the modeling variable, as 
hypothesized, proved to be a significant source 
of variance. 

Subjects in all modeling conditions re- 
warded themselves sparingly for 10 and 15-20 
point performances, and did not differ in this 
respect. On the other hand, exposure to 
models of varying degrees of competence 
produced differential amounts of self-reward- 
ing behavior in subjects at the moderately 
high (25-35) level of performance (F = 
5.28, p < .01). Additional subgroup analyses 
by the t test reveal that children who had 
observed the inferior model engaged in a 
greater amount of self-reinforcement than 
those who were exposed to the superior model 
(t = 2.33, p < .025). Moreover, control-group 
subjects rewarded themselves more 
generously for attainments at this level than 
those in conditions involving the superior 
(t = 3.71, p < .001) and the equally 
competent (t = 2.97, p < .01) models. 

A highly significant modeling effect (F = 
5.60, p < .01) was also obtained at the 40-60 
level of achievement. Subjects who had 
observed the inferior model exhibited a higher 
magnitude of self-reward than children who 
had observed either the superior model (t 
= 3.37, p < .001), the equally competent 
model (t = 3.01, p < .01), or had no 
exposure to modeling cues (t = 3.48, p < .001). 

It will be recalled that the models re- 
warded themselves with one piece of candy 
for criterion-level scores, two pieces for per- 
formances slightly above their adopted stand- 
ard, and three for scores exceptionally high 
relative to this criterion. The fact that some 
subjects never rewarded themselves following 
certain scores precluded analysis of within-subject 
variations in magnitude of self-reward 
as a function of performance level. It is evi- 
dent from Table 3, however, that each of the 
modeling conditions yielded a positive mono- 
tonic relationship between achievement level 
and magnitude of self-reward; by contrast, 
subjects in the control group did not show 
much variation in this respect. 


Frequency of Verbal Self-Reinforcement 


Since the incidence of verbal self-reinforc- 
ing responses was relatively low, the subgroup 
data were combined across performance levels 
and evaluated by the Kruskal-Wallis one-way 
analysis of variance, and the Mann-Whitney 
U test. 

The results reveal a significant modeling 
effect (H = 8.32, p < .025), and a marked 
sex difference (z = 2.94, p < .01). Further 
comparisons show that, relative to the con- 
trol group, subjects exposed to the inferior 
(z = 1.69, p < .05), or the equally competent 
models (z = 1.66, p < .05), exhibited 
more self-reinforcing verbal behavior. No 
differences were found, however, either be- 
tween the superior model and control groups, 
or among the three modeling conditions. 
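
As a rough illustration of the omnibus test named above, the following sketch computes the Kruskal-Wallis H statistic from pooled ranks. It is not the authors' computation: the function names are hypothetical, the input is an invented toy example instead of the study's data, and no correction for ties is applied to H (tied values simply share an average rank).

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), without tie correction."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


if __name__ == "__main__":
    # Invented scores for three groups, for illustration only.
    toy_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(round(kruskal_wallis_h(toy_groups), 3))
```

With k groups, H is referred to the chi-square distribution with k - 1 degrees of freedom; pairwise follow-ups such as the Mann-Whitney U test are then run on the two groups of interest.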

It is of interest that, although boys displayed 
a greater amount of verbal self-reinforcement 
than did girls in treatment conditions 
involving the inferior model (z = 
2.07, p < .05), the equally competent model 
(z = 1.99, p < .05), and the control group 
(z = 2.59, p < .01), they did not differ sig- 
nificantly in this respect when exposed to the 
superior model. 


Discussion 


The findings of the present study provide 
considerable evidence for the influential role 
of social comparison processes and modeling 
cues in the development of self-reinforcing 
patterns of behavior. Subjects in the control 
group, who were provided no comparison 
models, showed neither a discriminative pat- 
tern of self-reinforcement nor increasing mag- 
nitudes of self-reward as a function of in- 
cremental performances. On the other hand, 
children in the modeling conditions displayed 
distinct patterns and magnitudes of self- 
reward that differed in predicted directions. 

Children in the inferior model condition 
engaged in a considerably higher frequency 
of self-reward following relatively low per- 
formances than subjects who had been ex- 
posed to more competent models adopting 
higher criteria for self-reinforcement. In the 
case of moderately high and superior attain- 
ments, no differences of note were obtained 
among the groups of experimental subjects in 
the frequency with which they rewarded 
themselves with candy. At these high levels 
of performance, however, children who had 
observed the inferior model were more gen- 
erous in their self-reward, indicating a higher 
evaluation of the quality of their performances. 

With the single exception that subjects in 
the superior-model condition displayed a 
slightly higher frequency of self-reinforcement 
at the highest performance level than children 
who had observed the equally competent 
model, the latter two groups yielded equiva- 
lent patterns and magnitudes of self-reward. 
Thus, in accord with social comparison 
theory, children who had been exposed to the 
superior model rejected his self-imposed con- 
tingencies of reinforcement and adopted a 
lower criterion. This outcome is somewhat 
analogous to familial circumstances in which 
the offspring of eminent parental models set 
themselves comparatively low standards of 
achievement and self-reward. It is evident 
from informal observation, however, that 
under similar conditions many children do 
adopt their parents’ high aspirations and 
stringent patterns of self-reinforcement. To 
further elucidate this problem, the influence 
of social-learning variables that have been 
shown to enhance modeling effects will be 
investigated for the purpose of specifying the 
conditions under which the self-reinforcing 
behavior of superior models will be adopted 
or rejected. 

Self-administration of positive reinforcers 
following highly marginal or undeserving per- 
formances is likely to generate negative self- 
reactions. Consequently, self-rewards may be 
more sparingly dispensed in achievement sit- 
uations that fail to provide objective, non- 
social criteria of what constitutes a worthy 
performance. It is perhaps for this reason that, 
in most of the intergroup analyses, control 
subjects displayed a lower incidence of self- 
reinforcement than did children in the model- 
ing conditions. Marston (1964) has similarly 
demonstrated that subjects engage in less 
self-rewarding behavior on ambiguous than 
on structured tasks, but are more inclined to 
reward themselves when their responses cor- 
respond to those exhibited by another person 
in the ambiguous situation. These findings 
further highlight the importance of social 
comparison in self-reinforcing processes. 

Predictions regarding the effects of ante- 
cedent success or failure were only partially 
confirmed. Although in each of the modeling 
conditions subjects who had undergone failure 
experiences generally rewarded themselves less 
frequently than their successful counterparts, 
only the differences in the group exposed to 
the inferior model were of statistically sig- 
nificant magnitude. On the other hand, con- 
trol subjects who had experienced failure 
displayed a higher rate of self-reinforcement 
at low and moderately high levels of perform- 
ance than did children in the success condition. 

The latter finding, which is in a direction 
contrary to prediction, suggests that under 
certain circumstances self-gratification may 
primarily serve a therapeutic rather than a 
self-congratulatory function. That is, after a 
person has undergone stressful failure ex- 
periences he may treat himself to a play, 
movie, savory dinner, nightclub or televised 
entertainment, or engage in other types of 
rewarding activities for the purpose of reduc- 
ing aversive stimulation generated by failure. 
Such temporary suspension of self-reinforce- 
ment contingencies represents a culturally 
sanctioned therapeutic practice that is frequently 
noted in naturalistic situations. In 
view of the fact that the self-rewarding test 
situation constituted an additional self-evalua- 
tive achievement task for subjects in the 
modeling conditions, it is perhaps not sur- 
prising that they continued to adhere to 
equally or even more stringent self-reinforce- 
ment contingencies under conditions of failure 
as compared to success. 

Superior attainments outweighed the effect 
of reinforcement history as evidenced by the 
fact that subjects in all modeling conditions 
exhibited equally high rates of self-reward fol- 
lowing high scores, regardless of whether they 
had previously met with success or failure. 
Nor did the control subjects differ in this 
respect at similarly high levels of achievement. 
It is apparent from the foregoing interactions 
of failure with performance level and ade- 
quacy, as defined by comparison models, 
that the effects of antecedent reinforcement 
of achievement behavior on self-rewarding 
tendencies are considerably more complex 
than was originally assumed. 

It should also be noted that somewhat 
different patterns of relationships were ob- 
tained depending upon whether material or 
verbal reinforcers were employed as dependent 
measures. Of particular interest is the finding 
that boys and girls differed significantly in 
frequency of verbal self-reinforcement, but 
not in the incidence and magnitude of self- 
administered material rewards. These differ- 
ential results may be due partly to the fact 
that verbal self-reinforcements, which involve 
positive and negative self-evaluative 
responses, are a closer reflection of a person's 
self-esteem than the consumption of food 
reinforcers. Findings of studies conducted by 
Pauline Sears? show that boys tend to 
evaluate themselves more favorably on motor 
skills than do girls. Hence, differential self- 
evaluative predispositions, if operative in rela- 
tion to the performance task employed in 
the present experiment, may partly explain 
the obtained sex differences. The fact that 
exposure to the superior model, the most 
potent condition for generating low self- 
evaluations, diminished boys’ self-reinforcing 
verbal behavior, accounts for the absence 
either of sex differences within the latter 
treatment, or of a significant differentiation 
between subjects in the superior model and 
control groups. 

Although the foregoing results provide some 
support for the modeling hypothesis, no 
relationships were established between failure 
experiences and verbal self-reinforcing responses. 
Suggestive evidence that predispositional 
and concomitant stimulus variables 
may have differential impact on verbal 
and material self-rewards indicates the 
necessity for distinguishing in future research 
between different classes of self-reinforcing 
responses. 

“The Effect of Classroom Conditions on the 
Strength of Achievement Motive and Work Output 
on Elementary School Children,” progress report to 
the United States Office of Education, Cooperative 
Research Project No. 863, Stanford University, 1963. 
OPINION CONFORMITY IN GROUPS UNDER 
STATUS THREAT ¹ 


LEON H. ZEFF ² AND MARVIN A. IVERSON 
Adelphi University 


College students (N = 76) participated in 19 5-member groups. Groups were 
oriented by E and later reinforced by a confederate member to obtain either 
high or low comparability among members. Members were also informed of 
the possibility of being moved to another group with either inferior status 
(downward mobility) or to a higher status group (upward mobility). Each 
group discussed the relative seriousness of 12 student offenses and derived a 
consensus ranking. The degree to which members showed private conformity 
with group opinion was a function of interpersonal comparability when status 
mobility was downward (i.e., under status threat) rather than upward. 
Within-group conformity was interpreted as instrumental in maintaining 
individual's status in a hierarchical system. 


Members of experimentally created groups 
tend to accept interpersonal influence and 
to conform privately with group judgments 
when they perceive each other as being com- 
patible in personality traits and are oriented 
toward a congenial working relationship. Sus- 
ceptibility to influence and private opinion 
conformity are less evident among group 
members who regard each other as being dis- 
similar and uncongenial (see, e.g., Back, 
1951; Festinger & Thibaut, 1951; Sapolsky, 
1960). These within-group orientations, 
herein referred to as high and low comparability 
among members,³ have been employed 


1 A portion of the data presented in this paper 
was contained in the doctoral dissertation of the 
first author. 

2 Now with the Darien Schools, Darien, Connecticut. 

3 The term interpersonal comparability is used 
in this paper on the grounds that it refers more 
directly to defining operations than other terms in 
current usage. Such terms as group solidarity, 
cohesiveness, homogeneity-heterogeneity, or 
compatibility of membership are probably applicable to the 
experimental conditions under consideration. Each 
term, however, has been used to engender meaning 
which is slightly different or broader than 
operational definition would seem to warrant. 
Interpersonal comparability as used here refers to an 
interpersonal comparison (induced by an introductory 
orientation of similarity or dissimilarity among 
members) which predisposes individuals to move 
either toward or away from each other (i.e., decrease 
or increase horizontal social distance within a group). 
High comparability relates to a predisposition of 
group members to move together and to use each 
other for defining reality and low comparability or 
noncomparability to move apart from each other and 


with no direct attention to the group's status 
and the vertical orientation of its members to 
other groups embedded in a social hierarchy. 
Such consideration was the focus of the 
present investigation. 

The fact that vertical outgroup and horizontal 
ingroup orientations of group members 
have interrelated effects on behavior has 
some limited support in the experimental 
literature. Haber and Iverson (1965) have 
noted that student dyads were role protective 
in verbalizing downward to subordinates and 
more self-protective in communicating up- 
ward to superiors. Further, this selective use 
of language was more evident when com- 
municating dyads had low in contrast with 
high comparability. The status of student 
dyads, however, was not subject to change 
under the experimental conditions. 

The prospect of either upward or down- 
ward mobility from one group to another has 
been specifically studied by Kelley (1951). 
When threatened with demotion, his subjects 
were disinclined to promote interlevel co- 
hesiveness. Further evidence, although not 
clearly definitive, suggested that when downward 
mobile individuals expressed positive attitudes 
to others, they did so in their within-group 
rather than intergroup communication. 
Neither Kelley’s nor Haber and Iverson’s 
studies demonstrated, however, that upward- 


to use more individual, self-contained definitions of 
reality. A detailed discussion of the effect of social 
comparison on opinion conformity has been provided 
by Festinger (1957). 
and downward-mobility conditions clearly led 
members to accept influence within their own 
groups and to show opinion conformity in 
relationship to conditions of high and low 
comparability. The present study represented 
a step in clarifying this relationship. 

The formulation of specific hypotheses was 
based on the premise that threat of downward 
movement heightens anxiety and status 
defensiveness in comparison with the prospect 
of upward movement. Kelley noted, for 
example, that members of high status groups 
with downward mobility perceived persons at 
the other level as threats to their own de- 
sirable but unstable position. He described 
low status groups with upward mobility as 
being relatively free of such wariness. 

A further assumption was that group mem- 
bers under status threat (or anxiety-arousing 
conditions) are more sensitized to and more 
readily influenced by motivational dimensions 
of their within-group relationships than are 
persons with the prospect of status enhance- 
ment. Moreover, status threat would conceiv- 
ably increase individuals’ tendencies to de- 
pend on the defining properties of their own 
group for ascertaining reality. A parallel rela- 
tionship of this kind has been reported by 
Beckwith, Iverson, and Reuder (1965) in a 
study of test-anxious subjects in discussion 
groups. 

A within-group dimension which has major 
motivational significance for individual par- 
ticipants is the degree of comparability among 
members. With high comparability, members 
are oriented to find in each other similarity of 
disposition, mutual attraction and trust, and 
a congenial working relationship. (A more 
detailed discussion is outlined in Iverson, 
1962.) Group decision thereby comprises a 
valid standard for determining correct evalua- 
tions or opinions. Private conformity with this 
group standard would then be a mode of re- 
sponse for members to verify further their 
collective identity. 

Low comparability or noncomparability is 
present when members are oriented toward 

4 Anxiety is used here in the sense of fear of 
failure or social censure and is postulated to have 
ego-defensive effects. This formulation is to be dis- 
tinguished from facilitative anxiety which is ego 
enhancing and conceivably accompanies conditions 
of upward mobility. 
each other as being dissimilar in personality 
qualities and prone to distrust and disagree- 
ment. Criteria for ascertaining reality thus 
remain individualistic.⁵ The latter condition 
is one which explicitly emphasizes a need for 
members to retain their individual identity 
and an interest in being dissociated from the 
group. These individual needs would, how- 
ever, be in direct conflict with pressures for 
collective action on the part of all group 
members—for example, the necessity for a 
consensus ranking of opinion items. In such 
an instance, an individual member by demon- 
strating private nonconformity but public 
compliance with group opinion would be able 
to contribute to-a consensus in group discus- 
sion, that is, meet group demands, but would, 
at the same time, preserve his individuality 
and afford himself substitutive flight from the 
group. 

Thus, conformity with the consensus of 
one's own group in a private expression of 
opinion may be understood as instrumental in 
preserving homogeneity with high compara- 
bility. Private nonconformity would foster 
heterogeneity among noncomparable mem- 
bers. Further, because of the greater sensi- 
tivity of members under status threat to the 
need-relevant properties of their within-group 
relationships, the difference in private con- 
formity behavior of members of high and low 
comparable groups would conceivably be 
greater with the prospect of downward than 
upward movement in status. This hypothesis 
provided a basis for the present investigation. 

The above formulation was necessarily 
tentative, making the study exploratory. An 
alternate consideration was that individual 
group members would seek protection against 
demotion by becoming more group centered 
and conforming to the consensus of other 
persons who mutually possessed high status.⁶ 


5 The proposition that consensus defines reality for 
members of high comparable groups but is a less 
valid criterion in noncomparable groups has been 
developed by Pepitone (1964). 

6 These contentions should be distinguished from 
Back's (1951) formulation of prestige and status as 
a determinant of group cohesiveness. His conditions 
were nonmobile, containing no implications of 
changes in status, and his subjects were given no 
orientation with respect to other groups in a social 
hierarchy. Furthermore, the conformity behavior of 
his subjects was postulated in terms of members' 


Members of low status groups with the prospect 
of promotion would, out of interest in 
self-enhancement, be provoked to achieve 
prominence by exhibiting less conformity in 
opinion. Thus, in either case, the impact of 
status mobility would be to nullify initial 
differences in within-group comparability and 
their subsequent effects on conformity be- 
havior. The latter formulation has relatively 
little basis in the empirical literature. Con- 
sequently, the former proposition that the 
prospect of downward in comparison with 
upward mobility would enhance effects of the 
comparability variable on opinion conformity 
was retained as a working hypothesis. 


METHOD 
Summary 


College students comprised 19 five-member groups. 
Orientation by the experimenter and reinforcement 
by confederates were provided to establish either 
high or low comparability among group members. 
Individuals were also informed of the possibility 
of being transferred either to a higher status group 
or to a group with inferior status. Each group dis- 
cussed the relative seriousness of 12 student offenses 
and derived a consensus ranking. The effect of the 
various experimental conditions on opinion con- 
formity was measured by comparing individuals’ 
private ranking of the 12 offenses following the 
discussion period with the group's earlier consensus 
ranking. 


Subjects and Design 


Each discussion group consisted of four unin- 
structed members who comprised subjects in the 
investigation and a fifth member who played a 
confederate role as prearranged with the experi- 
menter. Membership in each group was homogeneous 
with respect to sex. Of the total 20 groups, 12 were 
comprised of women and 8 were assembled from 
among undergraduate men students in psychology 
courses. One of the groups of women was later 
eliminated in the analysis of results because of 
failure to carry out the ranking task according to 
instructions. Results were based on a total N of 76. 

Participants in any one group were unacquainted 
with each other prior to the experimental procedure. 
Assignment of subjects to discussion groups was 
randomized within the practical limits of class and 
commuting schedules. 

The variable of upward and downward mobility 
in status and the variable of high and low 
comparability of members were combined according to 
a 2 × 2 factorial design. The design was replicated 
five times, each time with a different confederate, 


attraction to the group and pressure for uniformity 
rather than a self-protective mode of response in a 
hierarchical social setting. 
thus making the overall design a 2 × 2 × 5 factorial. 
The order of conditions with individual confederates 
was randomized. Since confederates dealt only with 
subjects of the same sex, effects associated with the 
different confederates were confounded with the sex 
variable. 
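
The design arithmetic above can be checked with a short sketch. It enumerates the 2 × 2 × 5 cells (one discussion group per cell) and recovers the reported N of 76 once the one eliminated group is removed. The constant and function names are illustrative assumptions; the text specifies only the factor levels, the four uninstructed subjects per group, and the single dropped group.

```python
from itertools import product

# Factor levels as described in the text; the identifiers are hypothetical.
MOBILITY = ("upward", "downward")
COMPARABILITY = ("high", "low")
CONFEDERATES = (1, 2, 3, 4, 5)  # five replications, one confederate each
SUBJECTS_PER_GROUP = 4          # the fifth member of each group was the confederate


def design_cells():
    """List every cell of the 2 x 2 x 5 factorial, one discussion group per cell."""
    return [
        {"mobility": m, "comparability": c, "confederate": k}
        for m, c, k in product(MOBILITY, COMPARABILITY, CONFEDERATES)
    ]


def total_subjects(groups_dropped=0):
    """Count uninstructed subjects after removing any eliminated groups."""
    return (len(design_cells()) - groups_dropped) * SUBJECTS_PER_GROUP


if __name__ == "__main__":
    print(len(design_cells()))   # number of groups assembled
    print(total_subjects(1))     # subjects remaining after one group was eliminated
```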


Operational Conditions and Procedure 


Before joining an experimental group, all subjects 
had completed in their psychology classes a per- 
sonality test of interpersonal compatibility. They 
had also individually rank ordered in terms of 
seriousness a list of 12 student offenses. The experi- 
menter began the experimental procedure by orient- 
ing each group as possessing either high or low 
comparability. 

Comparability of membership. To establish high 
comparability, the experimenter reminded subjects 
of their previous test administration. He remarked 
that subjects comprised a group which was almost 
perfectly matched in compatibility needs. The ex- 
perimenter further commented that their group was 
comprised of very congenial people and that they 
had much in common and should get along very well 
together. For low comparability, the experimenter 
referred to his difficulty in scheduling compatible 
subjects for the same hour. He indicated that mem- 
bers of this group were poorly matched, that they 
had little in common, and would probably find 
themselves irritated by disagreements. 

Discussion task. Each group was instructed to 
discuss a standard list of 12 student offenses and 
to arrive at a consensus ranking of the offenses in 
terms of their seriousness. These items were the same 
as those previously rank ordered by subjects. Ten of 
the items in the list were found to have moderate 
and similar ratings when presented in a longer list 
of 30 items to a group of 100 students in a pre- 
liminary study. To provide end anchors, 2 addi- 
tional items were added, one which was commonly 
rated as a very minor offense and the other as an 
extremely serious offense. Illustrations of offenses 
were as follows: “failure to return reserve books to 
the library on time,” “carving initials in desks,” and 
“feigning illness to postpone an examination.” 

Preliminary interaction. After presenting the task, 
the experimenter indicated that he had to return to 
his desk in another room to retrieve some forms and 
that in his absence the group should begin their 
discussion. Mention was also made of the fact that 
discussion was being tape-recorded. The experi- 
menter returned after 5 minutes to administer a 
short questionnaire. Four items were designed to 
measure subjects’ reactions to the comparability 
orientation together with the initial interaction which 
transpired among members. 

When the comparability scale was completed, the 
experimenter told group members that they would 
have 13 more minutes in which to reach a consensus 
on the ranking of the 12 offenses. 

Mobile status hierarchy. All groups were then 
told that student discussions were being conducted 
because of the college Dean’s concern regarding 
policies pertaining to student offenses and that the 
procedure required two different discussion groups 
to meet at the same time. Further instructions 
consisted of standard comments by the experimenter 
that individuals could be moved to the other group 
which had either superior or inferior status. In 
defining downward mobility, the experimenter men- 
tioned that “the optimal size of discussion groups 
poses a problem and [that] possibly some members 
will have to be moved either during or at the end 
of the initial discussion period." Relative to their 
own group, subjects were informed that "the other 
group has the unimportant, routine job of merely 
discussing the specific penalties" with the experi- 
menter's assistant and that "anyone could do the 
other group's job." To emphasize the inferior rank- 
ing of the other group, the experimenter stated that 
the Dean attached less significance to the other 
group's decisions and regarded the present group 
as having a more difficult and challenging task of 
reaching a consensus on student offenses. 

In the instructions to establish upward mobility, 
the experimenter again posed the possibility of in- 
dividual members being moved to the other group, 
and in such case individuals would be joining a 
superior-ranking group. The experimenter described 
the other group as having “the really difficult and 
important job of deciding on the actual penalties to 
be assigned to each offense." In emphasizing the 
differential status of groups, the experimenter 
remarked that his faculty supervisor was working 
with the other group and that the Dean would be 
consulting directly only with members of the other 
group for their views. The work of the present 
group was referred to as “a very simple, routine 
task of merely deciding how serious offenses are." 
The two mobility orientations were patterned after 
those employed by Kelley (1951). 

Confederate's role. During the subsequent dis- 
cussion period, the experimenter was absent from 
the room. The confederate member introduced 
standard comments to reinforce the experimenter's 
orientation on comparability among members and 
the prospect of being moved either up or down 
in the experimentally created hierarchy. The in- 
structions to confederates were to participate other- 
wise in a neutral and compliant manner. The presence 
of confederates was designed to strengthen the 
effectiveness of the operational conditions. 

Postdiscussion procedure. At the end of the dis- 
cussion period, the experimenter returned to the 
group and recorded the consensus ranking of student 
offenses. Each group was then shown two TAT cards 
and subjects were asked to write stories individually 
about each card. This administration was carried out 
for 15 minutes and was used to help disguise the 
experimenter’s immediate objectives in the investiga- 
tion. Following the 15-minute delay, subjects were 
provided with individual lists of the 12 offenses 
and required to submit their own private rankings 
of items. Finally, subjects completed a four-item 
questionnaire which was designed to assess individual 
reactions to the orientation on status mobility. They 
were also required to record again their feelings of 
comparability. To terminate the procedure, the ex- 
perimenter explained the contrived nature of the 
operational conditions and the rationale of the 
study and cautioned subjects to refrain from dis- 
cussing the procedure with other students. 


Measure of Opinion Conformity 


An index of degree of opinion conformity was ob- 
tained by correlating each subject’s private ranking of 
the 12 student offenses after the group discussion 
with the consensus ranking of the group. Kendall’s 
partial tau coefficient (as presented by Siegel, 1956) 
was computed, correction being made for an in- 
dividual’s ranking prior to the experimental ses- 
sion. The higher the correlation obtained between 
individual and group rankings, the more the sub- 
ject conformed to the collective opinion of his 
group. 
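
The index described above can be sketched as follows. This is a minimal illustration, assuming untied rankings and the standard form of Kendall's partial tau, tau_xy.z = (tau_xy - tau_xz * tau_yz) / sqrt((1 - tau_xz^2)(1 - tau_yz^2)). The function names and the example rankings are hypothetical and are not data from the study.

```python
from itertools import combinations
from math import sqrt


def kendall_tau(a, b):
    """Kendall's tau for two rankings of the same items (no ties assumed)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        product = (a[i] - a[j]) * (b[i] - b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) / 2
    return (concordant - discordant) / n_pairs


def partial_tau(x, y, z):
    """Tau between x and y with the ranking z held constant.

    Undefined (division by zero) if z agrees or disagrees perfectly
    with x or with y.
    """
    t_xy = kendall_tau(x, y)
    t_xz = kendall_tau(x, z)
    t_yz = kendall_tau(y, z)
    return (t_xy - t_xz * t_yz) / sqrt((1 - t_xz ** 2) * (1 - t_yz ** 2))


# Hypothetical rankings of five offenses (1 = most serious).
private_after = [1, 2, 3, 5, 4]    # subject's private ranking after discussion
group_consensus = [1, 3, 2, 4, 5]  # consensus ranking of the group
private_before = [2, 1, 4, 3, 5]   # subject's ranking before the session

conformity_index = partial_tau(private_after, group_consensus, private_before)
```

A higher value of `conformity_index` would correspond to closer agreement with the group consensus once the subject's prior ranking is taken into account.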


RESULTS 


Sociometric Data 


The operational procedures for establishing high and low 
comparability among group members and upward or downward 
mobility were verified by subjects’ questionnaire responses. 
Chi-square tests of the differences in median ratings for 
both sets of conditions were statistically significant at 
the .001 level. Both after a 5-minute preliminary discussion 
and at the end of the experimental session, subjects under 
the condition of high comparability regarded their group as 
more involved in the discussion and more congenial than 
those under an orientation of low comparability. High 
comparable subjects also indicated a stronger interest in 
meeting again as a group. With the condition of downward 
mobility, subjects attached less significance to the work 
and were less favorably disposed toward being moved to the 
“other” group than were members confronted with the 
condition of upward mobility. Subjects with upward 
orientation viewed the other group's function as requiring 
broader background and knowledge on the part of members. 

On the questionnaire relating to interpersonal 
comparability, subjects’ ratings at the end were contrasted 
with those recorded near the beginning of the experimental 
session. The possibility existed that the experimenter's 
instructions to establish hierarchical mobility and the 
confederate's follow-up reinforcement during the group's 
discussion would contaminate the treatments for high and low 
comparability. For example, the effects of the 
experimenter’s remarks on upward mobility could possibly 
have brought about high comparability as the discussion 
progressed whereas the experimental design required low 
comparability among members. No significant trend of this 
kind was evident in subjects’ ratings (χ² = 2.80, df = 6, 
p > .20). Therefore the operational conditions for 
comparability and status mobility were sufficiently 
independent to meet the requirements of factorial design. 


Experimental Effects on Measures of Opinion 
Conformity 


Treatment of data. Partial tau coefficients for each 
subject (N = 76) were given Fisher’s Z transformation to 
obtain normality of distribution within subclasses. Higher 
final values represented higher degrees of opinion 
conformity (or lower degrees of nonconformity). These 
transformed values were subjected to a 2 × 2 × 5 analysis 
of variance. A summary is provided in Table 1. 
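
The transformation step can be sketched briefly. This assumes the usual definition of Fisher's Z, z = 0.5 * ln((1 + r) / (1 - r)), which is the inverse hyperbolic tangent; the coefficient values below are hypothetical and are not taken from the study.

```python
from math import atanh, tanh


def fisher_z(coefficient):
    """Fisher's Z: 0.5 * ln((1 + r) / (1 - r)), defined for -1 < r < 1."""
    return atanh(coefficient)


# Hypothetical partial tau coefficients for three subjects.
partial_taus = [0.20, 0.55, 0.80]
transformed = [fisher_z(t) for t in partial_taus]

# The transformation is monotonic, so a higher coefficient always gives
# a higher Z, and tanh() recovers the original coefficient.
recovered = [tanh(z) for z in transformed]
```

Because the transformation stretches values near the ends of the scale, differences among high coefficients carry more weight in the analysis of variance than the same differences would on the raw scale.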

Differences in tau coefficients which were 
associated with the presence of different con- 
federate group members were statistically 
nonsignificant. As indicated in Table 1, this 
result applied to the main effects as well as 
the interaction effects involving confederates. 
Hence results for various confederates were 
combined. Means and standard deviations for 
combinations of interpersonal comparability 
and upward-downward mobility are given in 
Table 2. 
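
As a check on the arithmetic of Table 1, each F ratio is the mean square for an effect divided by the error mean square. The short sketch below uses the mean squares printed in the table for comparability (58.30), the mobility by comparability interaction (55.37), and error (9.79); the function name is hypothetical.

```python
def f_ratio(ms_effect, ms_error):
    """F ratio for a fixed effect: effect mean square over error mean square."""
    return ms_effect / ms_error


MS_ERROR = 9.79  # error (within) mean square, df = 57

f_comparability = round(f_ratio(58.30, MS_ERROR), 2)  # 5.96
f_interaction = round(f_ratio(55.37, MS_ERROR), 2)    # 5.66
```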

Findings. The major finding was a statistically 
significant interaction between interpersonal comparability 
and direction of status mobility (refer to Table 1). As 
shown in Table 2, opinion conformity was most evident among 
group members who were given an orientation of high 
comparability and downward mobility. The condition of low 
comparability and downward mobility was associated with the 
lowest degree of opinion conformity (or highest degree of 
nonconformity, that is, the lowest mean value of transformed 
tau coefficients). Intermediate and similar values were 
obtained for groups with an orientation of upward mobility. 
Thus, opinion conformity was a function of interpersonal 
comparability when status was defined as mobile in a 
downward direction but not when the possibility of upward 
mobility was posed. This interrelationship supported the 
hypothesis and is illustrated in Figure 1. 


TABLE 1 

SUMMARY OF ANALYSIS OF VARIANCE OF MEASURES OF 
PRIVATE CONFORMITY TO GROUP OPINION (TRANS- 
FORMED PARTIAL TAU COEFFICIENTS) 

Source                          df      MS       F 
Upward-downward mobility (A)     1     1.53    <1.00 
Comparability (B)                1    58.30     5.96* 
Confederate (C)                  4    18.25     1.86 
A × B                            1    55.37     5.66* 
A × C                            4    25.34     2.50 
B × C                            4     7.97    <1.00 
A × B × C                        4    12.35     1.26 
Error (w)                       57     9.79 

* p < .025. 


TABLE 2 

MEANS AND SDs OF MEASURES OF PRIVATE 
CONFORMITY TO GROUP OPINION 

Condition                                  N      M      SD 
High comparability-downward mobility      20    .846    .324 
High comparability-upward mobility        17    .685    .270 
Low comparability-downward mobility       20    .513    .376 
Low comparability-upward mobility         20    .690    .295 

Note.—Indices of opinion conformity are partial tau co- 
efficients converted to Fisher's Z. Higher mean values are asso- 
ciated with greater conformity (or less nonconformity) of in- 
dividual members to group consensus rankings. 

The main effect obtained with the dimension of high and 
low comparability was statistically significant (see 
Table 1) whereas the main effect of upward-downward 
mobility was nonsignificant. These outcomes as seen in 
Figure 1 reflected the interaction effect and as such were 
given no further interpretation. 

FIG. 1. The interrelated effect of interpersonal 
comparability and status mobility on individuals’ 
conformity to group opinion. 


Discussion 


The results were consistent with the view- 
point that opinion conformity or noncon- 
formity is instrumental in reducing anxiety 
which is associated with certain properties of 
group experience—in the present investiga- 
tion, high and low comparability combined 
with the threat of downward movement in a 
status hierarchy. Greater sensitization occurs 
with downward mobility and presumably 
emphasizes the need for collective identity 
with high comparability and the need for in- 
dividuated identity when low comparability 
exists among members. Thus, evidence for 
relatively high opinion conformity could be 
said to represent an attempt of individual 
members to protect themselves from demotion 
to another and inferior discussion group when 
their own group composition is high in com- 
parability. Evidence for relatively low con- 
formity of opinion also represented a pro- 
tective device against demotion but in groups 
wherein individuals experienced low inter- 
personal comparability. 

Opinion conformity or nonconformity under 
status threat as defined in the present study 
may be referred to as status-maintaining be- 
havior. In orienting subjects to the possibility 
of movement to an inferior-ranking group, the 
experimenter necessarily implied that present 
group members occupied temporarily a posi- 
tion of relative superiority in the status 
hierarchy. With high comparability, individ- 
ual members by showing private as well as 
public compliance with judgments of a group 
with relatively higher rank could reassure 
themselves in the stability of their own status 
ranking. In low comparable groups under 
status threat, individuals perceived each other 
as differing in disposition although the group 
had relatively superior status. Consequently, 
individuals in recording private opinions which 
deviated from group opinion validated the 
difference in disposition expected of them, and 
thereby made their identity with higher rank 
more secure. 

More generally, status maintenance is de- 
fined in terms of behavioral operations which 
the occupant of a hierarchical position em- 
ploys to secure the interpersonal benefits and 
self-protections ordinarily accorded him by 
virtue of his ranking. Maintenance operations 
of a status personage comprise specialized 
behaviors which must conform to the expect- 
ancies shared by other persons located in the 
social system. Moreover, as noted in studies 
by Haber and Iverson (1965) and by Gold- 
berg and Iverson (1965), such operations 
must also allow position occupants to retain 
an acceptable private self-image. Thus, it be- 
comes understandable that public and private 
expressions of opinion, in the sense that both 
affirm status, may vary depending on the 
structural properties of the social relation- 
ships in which an individual’s status is de- 
fined. In terms of the present investigation, 
significant factors are members’ orientation 
to vertical movement in the social hierarchy 
and to their within-group comparability. 


Analysis of Content of Group Discussion 


In an attempt to verify the basis for dif- 
ferences in conformity behavior of high and 
low comparable members under threat of 
downward movement in status, transcripts of 
discussion in 15 of the 19 groups were ex- 
amined for differences in content. Transcripts 
for the remaining four groups (evenly dis- 
tributed across experimental conditions) were 
not available because of defective recording. 

The heightened anxiety which was postulated for downward 
mobility in comparison with upward mobile groups was 
represented by a variety of content measures. When 
threatened with downward movement, groups engaged in more 
discussion irrelevant to the assigned task (p < .20), 
volunteered fewer original ideas (p < .10), showed less 
attempt to formulate guide rules for rating (p < .025), 
expressed more counternorm attitudes (p < .10), used a 
higher proportion of words signifying interpersonal 
rejection (p < .20), and referred more frequently to high 
status personages (p < .20).⁷ Although comparisons on 
individual measures did not in most instances meet 
conventional standards of statistical significance, the 
overall pattern was qualitatively consistent.⁸ The pattern, 
as such, suggested a group atmosphere wherein individuals 
were more predisposed toward either dependency or autistic 
reactions and toward either flight from the group or 
interpersonal assertiveness, depending on the comparability 
dimension of the group membership.⁹ The more frequent 
references to high status personages implied that members 
were status conscious as a result of the threat of 
demotion. High comparable groups provided more 
communication designed to alleviate interpersonal distress 
than did low comparable groups (p < .10). Discussions held 
by groups who were informed of the possibility of acquiring 
higher status appeared generally to be more task oriented 
and to display less emotional-motivational content. The 
upward mobile group members were, thus, less sensitized to 
the comparability orientation provided by the experimenter, 
and consequently were less disposed to modify their private 
opinions in either a more compliant or more nonconforming 
direction. 

Under the conditions of the present investigation, 
communication was directed toward members of one’s own 
group and the reference of subjects in expressing opinions 
was the group’s consensus. The sensitization effect of 
downward mobility, therefore, operated when members’ 
orientation was horizontal, that is, directed toward their 
own group. This result seems to overlap findings by Haber 
and Iverson (1965). Their groups communicated and stated 
opinions with reference to persons outside the group 
(rather than within) and above or below them in a status 
hierarchy. Under the latter conditions, sensitization to 
ingroup relations is heightened by vertical outgroup 
positioning when behavioral reference is ingroup. When 
behavioral reference is outgroup, then sensitization to 
structural properties of group experience is heightened by 
horizontal ingroup relationships. The complementary nature 
of these sensitization effects is consistent with the 
contention that ingroup behavior is equilibrated with 
outgroup relationships and with the context in which a 
group is located (see Bales, 1955). 


7 Criteria for content categories were either a priori 
or derived from tag concepts employed by a computer system 
known as the General Inquirer. The analysis involved an 
objective count of either sentences or words, and 
consequently no test of rater reliability was required. For 
the dictionary for the General Inquirer system, see 
McPherson, Dunphy, Bales, Stone, and Ogilvie (1962). A 
description of the General Inquirer system and the 
theoretical basis of tag concepts are outlined in Stone, 
Bales, Namenwirth, and Ogilvie (1962). 

8 The p values are based on a restricted N of 15. The 
content of each group discussion was treated as an entity, 
making the sample N = 15 rather than 76. 

9 Reference is made here to the defensive modes of group 
behavior when members are experiencing interpersonal 
stress. For a fuller discussion, see Bennis and Shepard 
(1956). 
EFFECTS OF PROBABILITY OF REWARD ATTAINMENT 
ON RESPONSES TO FRUSTRATION ¹ 


WALTER MISCHEL AND JOHN C. MASTERS 
Stanford University 


How do expectancies for ultimately obtaining a blocked or delayed reward in 
a frustration situation affect the value of the reward? Children viewed a film 
which was interrupted near the climax on the pretext of a damaged fuse. 
The probability that the film could be resumed was either 1, .5, or 0. Measures 
of the film's value were administered before and after the interruption. There- 
after, the fuse was “fixed” and all Ss saw the remainder of the film, with 
final value ratings obtained at the end. The hypothesis that the nonavailability 
of a reward increases its value was supported. Ss who were given a 0 prob- 
ability for seeing the remainder of the film increased their evaluation of it 
more than those in the other groups, and this increase was maintained even 
after the entire film was shown. 


Frustration may be defined as an imposed 
delay of reward and is typically operational- 
ized by interrupting or blocking the organ- 
ism’s progress towards a valued goal. Various 
determinants of the ensuing responses have 
been investigated, with the speed and pres- 
sure of plunger-pushing frequently used as 
measures of the intensity of responses to frus- 
tration in human studies (e.g., Ford, 1963; 
Haner & Brown, 1955). There is considerable 
evidence from these studies, as well as from 
animal investigations, that the expectancy of 
goal attainment, before the imposed delay of 
reward, affects the amplitude of responses to 
frustration monotonically (Amsel, 1958; 
Ferster, 1958). That is, when subjects with a 
high expectancy of goal attainment are frus- 
trated they respond more vigorously than 
those with lesser expectancies for reward. 

Surprisingly, a quite different aspect of expectancy in 
the frustration situation has been relatively ignored. 
Namely, after the onset of frustration, how does the 
subject’s expectancy for ultimately obtaining the blocked 
reward affect his responses? Especially when the objective 
probability for eventual goal attainment is ambiguous, 
subjective probabilities for obtaining the delayed goal may 
range from the definite anticipation of goal attainment to 
the expectation that the reward is lost irrevocably. It 
seems likely that such expectancies are potent determinants 
of responses to frustration and the systematic manipulation 
of these expectancies is the focus of the present study. 
Change in the evaluation of the blocked reward is the 
reaction to frustration of main interest in this study. 

1 This study was supported by Research Grant M-06830 from 
the National Institutes of Health, United States Public 
Health Service. Grateful acknowledgment is due to the 
administrators and teachers of the Whisman School District 
who generously cooperated in this research. 

In this experiment children were subjected 
to a frustration, in the form of an externally 
produced delay-of-reward. The experimental 
treatments involved variations in the proba- 
bility that the frustration would be terminated 
and the blocked reward attained. More spe- 
cifically, elementary school children viewed an 
exciting motion picture film which was inter- 
rupted near the climax on the pretext of a 
damaged electrical fuse. The experimentally 
presented probability that the fuse could be 
repaired and the film resumed was either 1, 
.5, or 0. A control group viewed the film with- 
out interruption. Response measures of the 
perceived value of the film, and the children's 
delay of reward behavior with other goal ob- 
jects, were administered before and after the 
imposed delay period. Thereafter the fuse was 
“fixed” and the remainder of the movie was 
shown to all subjects, with another rating of 
its value and attractiveness obtained at the 
end. 

There is some suggestive indirect evidence 
that the nonavailability of a reward increases 
its attractiveness or value (e.g., Aronson & 
Carlsmith, 1963). However, the relationship 
between expectancy or subjective probability 
and reward value or utility remains unclear. 
For example, Lewin, Dembo, Festinger, and 
Sears (1944) and Atkinson (1957) assume an 
inverse relationship between subjective proba- 
bility and reward value, whereas Rotter 
(1954), Edwards (1954), and others argue 
for the independence of these constructs. 

In a cogent discussion of this issue, 
Feather (1959a) has reasoned that, at least 
in our culture, persons learn to place greater 
value on the attainment of goal objects which 
are difficult to get because of the relatively 
consistent occurrence of sizable rewards for 
the successful achievement of difficult goals 
and deprecation or punishment for failure to 
attain easy goals. Moreover, achievement of 
the difficult is probably more typically and 
highly rewarded when it was due to the per- 
son's own efforts or skill rather than to 
chance factors beyond his control, and like- 
wise failure to achieve the easy is chastised 
more when it was due to the person's lack of 
skill than when it was a chance occurrence. In 
view of this, Feather hypothesized that an 
inverse relationship between attainment at- 
tractiveness (goal value) and success proba- 
bility would be more apparent in “ego-related” 
than in chance-related situations, and in 
achievement-oriented as opposed to relaxed 
conditions. Feather's (1959b) empirical re- 
sults supported an inverse relationship be- 
tween attainment attractiveness and success 
probability and indeed suggested that the in- 
dependence assumption may be an oversimpli- 
fication even for chance-related situations 
under achievement-oriented conditions. Simi- 
larly, Atkinson (1957) suggests that 


the incentive values of winning qua winning, and 
losing qua losing, presumably developed in achieve- 
ment activities early in life, generalize to the gam- 
bling situation in which winning is really not con- 
tingent upon one’s own skill and competence [pp. 
370-371]. 


In the present experiment, it was reasoned 
that the learned inverse association between 
reward value and attainment probability in 
achievement-related situations generalizes to 
non-achievement-related frustration conditions 
in which goal attainment is not in the sub- 
ject’s control and is not contingent on his 
behavior. If in our culture persons acquire the 
generalized expectation that unlikely or un- 
available positive outcomes are more valuable 
than likely or assured positive outcomes, then 
the value ascribed to an unattainable reward 
should be greater than that attributed to a 
reward that either may be attainable or whose 
attainment is assured. Accordingly, it was 
predicted that the perceived value of the de- 
layed reward (film) would be greater when it 
is ultimately unattainable (P = 0) than when 
its attainment is assured (P = 1). Moreover, 
it was anticipated that certainty of reward 
attainment minimizes the effects of the im- 
posed delay or frustration and therefore no 
differences were expected between the P = 1 
treatment and the control group. Likewise, it 
was anticipated that the perceived value of 
the reward would be greater in the P = 0 con- 
dition than in the P = .5 group and that sub- 
jects in the latter would value the reward 
more than those in the P = 1 treatment or 
the control group. A posttest was included to 
determine whether differences between treat- 
ments in the perceived value of the delayed 
reward are maintained even after the frustra- 
tion is terminated and the delayed reward is 
obtained. 

It also seemed plausible that when the per- 
ceived frustration is greater subjects will more 
frequently self-administer other available im- 
mediate rewards. Therefore, when the delayed 
reward is permanently unattainable, immedi- 
ate self-reward (in the form of increased pref- 
erence for immediate smaller as opposed to 
delayed larger rewards) should be greater 
than when it is ultimately attainable (P = 0 
> P = 1). This was not hypothesized on the 
basis of any “compensatory mechanisms,” but 
on the assumption that individuals in our 
culture learn that immediate self-reward is 
more acceptable following strong frustration 
than following minimal frustration. 


METHOD 
Subjects 


The subjects were 56 boys and 24 girls, all sixth- 
grade students at two public schools in Mountain 
View, California. 


Design and Procedure 


Preexperimental assessment of delay-of-reward 
responses. In a preexperimental session the children 
were administered in their classroom groups a series 
of 14 paired rewards, in each of which they were 
asked to select either a small reward that could be 
obtained immediately, or a more valued item con- 
tingent on a delay period ranging from 1 to 4 weeks. 
The group administration (Mischel & Gilligan, 1964) 
proceeded in the following manner. Children were 
provided individual booklets containing on each page 
a brief description of a given set of paired objects 
and the associated time interval. After the experi- 
menter had displayed both rewards and explained 
the temporal contingency, the children were in- 
structed to record their choice and to turn the page 
in preparation for the next set of items. The sub- 
jects were also advised to choose carefully and 
realistically because in one of the choices they would 
actually receive the item they selected, either on the 
same day or after the prescribed delay period, de- 
pending upon their recorded preference. This promise 
was indeed kept. 

Half of the sets of paired rewards involved small 
amounts of money (e.g., $.25 today, or $.35 in 1 
week), while the remaining items included edibles 
(e.g., small bag of salted peanuts today, or a can of 
mixed nuts in 2 weeks), children's magazines, and 
various play materials (e.g., small rubber ball today, 
or a large rubber ball in 2 weeks). 

Assignment to treatments. The total pool of sub- 
jects was divided into quartiles on the basis of the 
distribution of delay-of-reward responses. An equal 
number of children from each quartile was randomly 
assigned to each of the three experimental groups 
and the control group, thus producing groups similar 
in their initial willingness to defer immediate re- 
wards for the sake of delayed, larger gratifications. 
The same proportion of boys and girls was assigned 
to each group. 
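
The quartile-based assignment described above can be sketched as a stratified random assignment. This is only an illustration under stated assumptions: the function name, the group labels, and the seed are hypothetical, the scores are invented, and the additional balancing of boys and girls mentioned in the text is not modeled.

```python
import random


def assign_by_quartile(scores, groups=("P=1", "P=.5", "P=0", "control"), seed=0):
    """Randomly assign subjects to groups, balancing across score quartiles.

    scores maps a subject identifier to a delay-of-reward score.
    """
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get)   # lowest to highest score
    size = len(ranked) // 4
    assignment = {}
    for q in range(4):
        # The last quartile absorbs any remainder.
        block = ranked[q * size:(q + 1) * size] if q < 3 else ranked[3 * size:]
        rng.shuffle(block)                    # random order within the quartile
        for position, subject in enumerate(block):
            assignment[subject] = groups[position % len(groups)]
    return assignment


# Hypothetical scores (number of delayed choices out of 14) for 80 children.
example_scores = {f"S{i:02d}": (i * 7) % 15 for i in range(80)}
example_assignment = assign_by_quartile(example_scores)
```

With 80 subjects, each quartile holds 20 children and contributes 5 to each group, so every group ends with 20 members drawn evenly from all four quartiles.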

At each of the two schools each of the four ex- 
perimental conditions was administered once, with 
an approximately equal number of children from 
each school in each condition. Two new experi- 
menters, unconnected with the preexperimental ses- 
sion, were used and each administered one half the 
treatments in one school and the other half in the 
second school. The temporal sequence of treatments 
at the two schools was balanced, the sequence in the 
second school being the reverse of the one in the first 
school. 

Experimental treatments. Approximately 4 weeks 
after the assessment of delay-of-reward responses, 
the experimental sessions were conducted in a re- 
search trailer stationed at the school. From 8 to 12 
subjects participated in each session. The experi- 
menter was introduced as coming from “Deluxe 
Movie Studios” to prescreen an “exciting space 
movie” and to obtain the children’s opinions about 
it. The film was a 20-minute documentary on space 
exploration. The children were told that they would 
fill out “audience estimate and opinion sheets” sev- 
eral times to determine their “feelings at different 
points.” These sheets contained the value ratings 
described below. The experimenter explained that he 
was also interested in children’s expectations about 
how attractive the film would be and therefore they 
would be asked to rate it before it actually com- 
menced. This rationale was used to obtain a base 
level of attractiveness ratings in all groups to serve 
as a comparison point for any subsequent changes in 
the rated value of the film. 

In the experimental groups the movie began as 
soon as the first value ratings were completed and 
collected. After 5 minutes, at a predetermined cli- 
mactic point in the film (just as the space ship was 
being launched) the projector failed. A confederate, 
posing as the “district electrician,” entered and ex- 
plained that the power failure was due to his over- 
loading the circuits with electrical tools. This ra- 
tionale was used to avoid connecting the cause of 
frustration with either the experimenter and his pro- 
cedure or with the subjects’ own behavior. The ex- 
perimental treatments consisted of the following 
variations in the probability of resuming the inter- 
rupted film, announced by the confederate: 


[P = 1]: I'm positive I can fix it—I've had 
things like this happen in the past, and I've al- 
ways managed to fix them. 

[P = .5]: I've had things like this happen before 
—sometimes I was able to fix them, sometimes I 
wasn't. I never know for sure . . . there's prob- 
ably about a 50-50 chance. 

[P = 0]: It takes a special fuse for this circuit, 
and there are none around here . . . I can't pos- 
sibly fix it . . . there's no chance. 


To increase credibility, the experimenter asked if 
the confederate was sure of his evaluation and the 
confederate reiterated his initial statement confi- 
dently in paraphrased form and left. The experi- 
menter expressed his regret at the interruption but 
reminded the subjects that repeated “audience esti- 
mate” sheets were needed and circulated the second 
set of value ratings. 

In the control group the movie was not inter- 
rupted and both sets of ratings were obtained before 
the movie began. The second set of ratings was 
administered approximately 5 minutes after the first 
set, and during the intervening period the experi- 
menter prepared the film and projector. The ra- 
tionale given to the children for readministering the 
ratings was in terms of the need for repeated meas- 
urements of their feelings at different times. 

Following these second value ratings the children 
in all groups were administered a measure of im- 
mediate or delayed self-reward (described below). 

Approximately 10 minutes after his first entry, and 
after the second ratings of the film and the delay-of- 
reward measures were completed, the electrician re- 
turned to all experimental groups, announcing that 
he had been able to fix the fuse and the movie 
would continue. The remainder of the film was 
shown and, when it ended, ratings of the movie were 
obtained for the third and final time. Table 1 sum- 
marizes the design and measures for all conditions. 

Assessment of reward value. A four-item measure 
of reward or goal value was administered in printed 
booklets, the subject checking his response to each 
item on a 7-point intensity scale. The items were: 
TABLE 1 

SUMMARY OF THE EXPERIMENTAL DESIGN 

Experimental   Phase 1:        Experimental             Phase 2:       Delay        Phase 3: 
groups         first film      treatments               second film    measure(b)   third film 
               ratings(a)                               ratings                     ratings 

I              Same            Film interrupted;        Same           Same         Same 
(N = 20)                       P = 1 for resumption 
II             Same            Film interrupted;        Same           Same         Same 
(N = 20)                       P = .5 for resumption 
III            Same            Film interrupted;        Same           Same         Same 
(N = 20)                       P = 0 for resumption 
IV             Same            No interruption;         Same           Same         Same 
(N = 20)                       film not yet begun 

a Following this, film begins in all groups except IV. 
b Following this, film begins in Group IV. 


(a) Let’s pretend that you are the manager of a 
movie theatre. Instead of charging a set price for 
letting people see the movies you have, you do 
things differently—when you have a movie that 
you think is very good, you charge a lot for them 
to get in; when you think a movie is bad, you 
don’t charge very much at all. Now let’s pretend 
that the movie I have today is at your theatre— 
check about how much you would charge for 
tickets [0 to $1.00]. 

(b) Let's pretend that you have a month's al- 
lowance of $1.00 to spend on movies. Suppose the 
movie I have with me today is showing at a 
theatre you usually go to. Remember, you have 
$1.00 to pay for movies for a whole month, and 
if you go to see a show, part of your money will 
be gone. Now, let's also pretend that they don't 
have the regular prices at the theatre for this
movie, but they have special ones—and you don't 
know exactly how much they are. How much 
would you pay [0 to $1.00], out of your dollar, 
to see the movie I have here today if it were at 
that theatre? 

(c) We all go to see movies. Let's compare the 
movie I have here with some of the movies you 
have seen in regular theatres. Check above the 
words [from “much better” to “much worse”]
that tell how well you feel this movie compares 
[or will compare] with others. 

(d) We are interested in how good you think the 
movie is going to be [or is]. You are to put a 
check mark above the words [from “really good” 
to "awful"] which say how good you feel the 
movie is [or will be]. 


The above measures were given in the same ran- 
dom sequence to all subjects, with a different sequence 
used in each of the three administrations. 

Assessment of changes in delay behavior. Immedi- 
ately following the second set of ratings the chil- 
dren were administered a new series of 14 paired 
choices between immediately available smaller re-
wards and delayed larger rewards. The rewarding
objects differed from those employed in the pre-
experimental sessions, but the money items were the
same since pretesting revealed that subjects were 
unable to recall the exact amounts and temporal 
intervals involved. The experimenter indicated at 
the outset that these choices were unconnected with 
his own project and were being administered for a 
Stanford researcher during the available period in 
order to save time.²


RESULTS 


Prior to the experimental manipulations the 
groups did not differ appreciably in their 
initial evaluation of the film. The first (Phase 
1) mean value ratings of the movie in the 
control, P = 1, P = .5, and P = 0 groups, re-
spectively, were 16.85, 16.90, 16.75, and
16.55. Figure 1 shows the mean value rating 
of the film at each of the experimental phases 
for each group. 

To assess the effect of the independent 
variable, change scores were computed for the 
difference between each pair of value ratings 
in each group. Analysis of variance of the 
mean change in value ratings immediately 
after interruption of the film (difference be- 
tween Phase 2 and Phase 1 ratings) revealed 

2 At the end of each experimental session the chil- 
dren were informed of the importance of not com- 
municating with others about the experiment be- 
cause it would "spoil things for us and take the fun 
out of it for the other children.” All subjects agreed 
to this and were given the rewards they had been 
promised, either at this time or after the specified 
delay period, depending on the choice they had 
made. Informal postexperimental interviews indicated 
that there had been no communication about the 
particulars of the experiment. 
FIG. 1. Mean value of film at each phase of the
experiment. (Vertical axis: mean value of film; horizontal
axis: phases of the experiment, namely pretest,
postprobability, and postfilm.)


a significant effect (F = 5.91, df = 3/76, p
< .005). Table 2 summarizes the results of t
tests for differences between groups in mean
value change. It is evident (Row 1 of Table
2) that subjects who were told they would
definitely not see the remainder of the film
(P = 0) increased their evaluation of it sig-
nificantly more than those in all other condi-
tions. None of the other groups (P = 1, P =
.5, control) differed from each other.
Moreover, as Figure 1 indicates, the in-
creased evaluation of the film in the P = 0
condition was maintained even after the inter- 


3 As a partial check on the internal consistency of
the four items measuring the value of the film, the 
effects of the independent variable were examined 
separately for the first two items combined and the 
last two items combined. Since the results revealed 
highly similar trends the final analyses were based 
on all four items combined. 


WALTER MISCHEL AND JOHN C. MASTERS


ruption was terminated and the movie com- 
pleted. Analysis of variance of changes in 
value ratings from Phase 2 to Phase 3 
indicated no significant differences between 
groups (F < 1). However, analysis of vari- 
ance of change from the first preexperimental 
value measure to the terminal rating after 
completion of the film indicated significant 
treatment effects (F = 2.74, df = 3/76, p
< .05). The between-group comparisons of 
these change scores (Table 2) show that sub- 
jects in the P = 0 condition tended to main- 
tain their overevaluation of the movie even 
after they viewed the entire film, although 
the difference between the overall increment 
in the P = 0 group and the P = 1 condition
falls short of acceptable significance. 
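
The degrees of freedom reported for these analyses can be checked against the design in Table 1. The following is a minimal worked step, assuming a one-way analysis of variance across the four groups of 20 subjects each; the symbols k (number of groups) and N (total number of subjects) are introduced here only for illustration.

```latex
\begin{align*}
df_{\text{between}} &= k - 1 = 4 - 1 = 3,\\
df_{\text{within}}  &= N - k = (4 \times 20) - 4 = 76.
\end{align*}
```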

The data clearly showed an increase in the 
evaluation of an unattainable goal but there 
was no evidence for a linear inverse relation- 
ship between the probability of attaining a 
goal and the value attributed to it. Value 
changes in the P=.5 condition were not 
significantly different from those in the P = 1 
or control groups. Indeed, the mean terminal 
value rating was slightly (not significantly) 
higher in the P = 1 treatment than in the
P = .5 condition. 

The effects of treatments on changes in 
delay-of-reward behavior were examined by 
comparing the number of preexperimental and 
postexperimental immediate reward choices 
made in each condition. The control, P = 1, 
and P = 0 groups showed mean increases in
immediate reward choices of 1.1, 1.5, and 1.3, 
respectively, whereas in the P = .5 condition 
there was a fractional decrease (—.3). Analy- 
sis of variance of these data yielded no 
significant effect (F < 1).


TABLE 2
BETWEEN-GROUP COMPARISONS OF MEAN CHANGES IN VALUE OF FILM

                                         P = 0      P = 0      P = 0      P = .5     P = .5     P = 1
Mean change between                      vs.        vs.        vs.        vs.        vs.        vs.
                                         [illegible] in each column
                                         t          t          t          t          t          t
1. Phase 1 and 2 (Preprobability to
   postprobability)                      3.88       3.12       2.94       .18        .94        .16
2. Phase 1 and 3 (Preprobability to
   postfilm)                             2.69*      1.70       2.20*      .49        .49        .98
DISCUSSION


The results show that the value of a 
blocked or delayed reward can be affected by 
the expectancy for its ultimate attainment. 
When the probability for seeing the inter- 
rupted film was stated as 0, its rated value 
increased significantly more than when it was 
1 or .5. When resumption of the film was 
presented as a certainty, children evaluated 
the film no differently than those who saw 
it without interruption. 

Although the value of the film increased 
most when its completion seemed nonattain- 
able, and this change was maintained even 
after the film was completed, significant in- 
creases did not occur when an intermediate 
probability (P = .5) was given. Subjects
given an intermediate probability did not dif- 
fer from those in the P = 1 and control con- 
ditions. An extension of the present study, 
using a large number of probability levels, 
would clarify whether the obtained effect of 
probability on value is restricted to unattain-
able outcomes (P = 0) or holds for highly
unlikely outcomes (e.g., P = .10). 

Since subjective probability does not nec- 
essarily match experimentally presented prob- 
ability statements, it would also be interest- 
ing to assess the subject’s expectancies for 
attaining the delayed reward and to examine 
the relationship between subjective probabil- 
ity and reward value in the present situation. 
For example, in the P = .5 treatment, most 
children may have had subjective expectancies 
for seeing the film which exceeded .5 consider- 
ably. Such effects may account for the lack 
of difference obtained between the P = 1 and
P = .5 manipulations. 

The present results indicate that in our 
culture unattainable positive outcomes may 
be more valued than those which are attain-
able and that the unavailability of a positive
outcome enhances its perceived desirability.
Moreover, the findings support the view that 
the higher value attributed to unlikely out- 
comes in achievement-related situations gen- 
eralizes very broadly even to non-achievement- 
related situations in which the probability 
for goal attainment is clearly independent of 
difficulty level and in which goal attainment 
is entirely outside the subject’s own control. 
These results have direct implications for
understanding responses to frustration. If the 
nonattainability of a reward increases its 
desirability, persons who, on the basis of their 
previous histories, expect that delayed or 
blocked rewards are lost irrevocably will re- 
spond quite differently from those who antici- 
pate their ultimate attainment. The individual 
who has learned that blocked goals tend to 
be unattainable may remain on the unhappy 
treadmill, expecting that what he wants can- 
not be obtained and overevaluating and want- 
ing what he cannot have. Certainly this is not 
an unfamiliar clinical phenomenon. In con-
trast, the person who has learned that frus-
trated goals ultimately tend to become avail- 
able may respond to delay of reward with 
equanimity. 

The treatments did not significantly affect 
self-reward behavior in the form of changes 
in willingness to defer immediate rewards for 
the sake of larger but delayed outcomes. Un- 
expectedly, there was a slight, nonsignificant 
trend towards increased delay behavior in the 
P = .5 condition, whereas in all other treat- 
ments subjects tended to increasingly choose 
immediate rewards. It may be speculated that 
when the ultimate attainment of the blocked 
reward is uncertain children try to be espe-
cially “good,” deferring immediate gratifica-
tion in the irrational hope that their “good”
behavior will increase the probability of ob- 
taining the blocked goal. This is sheer 
speculation but may be interesting to pursue. 

The present experiment was designed to 
minimize the occurrence of cognitive dis- 
sonance at the onset of frustration and there- 
fore the frustration was deliberately uncon- 
nected with the subject's own behavior and 
not contingent on his own decisions. In con-
trast, when a temporary or permanent delay
of reward is a consequence of the subject's 
own behavior, dissonance theory (Festinger, 
1957) might generate predictions opposite to 
those of the present study. In a recent study 
by Carlsmith (1962), subjects were exposed 
to the possibility of electric shock with varia- 
tions in the probability that they would 
actually receive the shock. The probability 
was ostensibly determined by the subject's 
performance on a fictitious personality test. 
Carlsmith’s prediction of an increase in the 
rated “pleasantness” of the shock as a func-
tion of the probability of receiving it was
supported and is consistent with dissonance 
theory. Comparisons between the present 
study and the Carlsmith experiment are pre- 
carious because they differ in several critical 
ways; for example, the latter was not a frus- 
tration paradigm, involved aversive rather 
than positive outcomes, and presented proba- 
bilities ostensibly determined by the subject’s 
own performance. 

In a very recent study by Turner and 
Wright (1965) children rated the attractive- 
ness of toys before and after a number of 
experimental manipulations. In one condition 
they were informed that they could “never” 
play with one of the toys they had just rated 
and in another that they could play with it 
“later.” The rated value of the toy decreased
in the “never” condition and increased in 
the “later” condition. These findings appear 
inconsistent with the present results but 
again there are numerous problems that pre-
vent clear comparisons. For example, in the
Turner and Wright study the children may 
have interpreted the “never” condition as the 
punishing consequence of their own rating 
behavior. Perhaps even more important, in 
both the Carlsmith (1962) and the Turner 
and Wright (1965) studies, the postmanipu- 
lation ratings were obtained after brief tem- 
poral delays (15 minutes and 5 minutes, 
respectively) whereas in the present experi- 
ment the reevaluation occurred almost im- 
mediately after the announced probabilities. 
It may be that such temporal effects are 
critical determinants of the relationship be- 
tween expectancy and value. The initial re- 
sponse to an unattainable positive goal may 
be to overevaluate it (as in this study) but 
after being faced with its unattainability for 
some time justification processes commence 
and the value of the reward becomes mini- 
mized. It would be interesting to test for 
a positive relationship between probability of 
goal attainment and reward value under 
dissonance-producing conditions (in which the 
frustration is the consequence of the subject’s 
own behavior) using an extension of the
present experimental design, and including a 
temporal variation in the amount of time 
elapsing between the onset of frustration and 
the measurement of reward value. 
SOME DETERMINANTS OF INTERPERSONAL
EVALUATING BEHAVIOR ¹
How do the evaluations 1 person, P, makes of his own acts, in conjunction with
his knowledge of the evaluations another person, O, makes of the same acts, 
affect P's ensuing evaluations of O's acts? This question was analyzed from the 
points of view of Heider’s balance theory and a more straightforward reciproca- 
tion theory. An experimental situation in which groups of 4 Ss continually ex- 
changed evaluations of one another's artistic judgments revealed that an S, in 
sending evaluations to others, tended to adopt a reciprocating strategy which 
was interpreted as an effort to maximize the rewards he received from others. 
The fact that this tendency was observed for low self-evaluators as well as 
high self-evaluators suggested that there may be certain limitations to a balance 
theoretical interpretation of the evaluative process. 


Interpersonal evaluations are the implicit 
or explicit expressions of positive or nega- 
tive value accorded by one person (P) to 
either the specific actions or the general char- 
acteristics of another person (O). In every- 
day social interaction, these evaluations are 
communicated by direct verbal expressions of
worth, or of agreement and disagreement, 
or by more subtle means of expression, such 
as attentiveness. The analysis of the evalu- 
ating process which follows will contrast two 
theoretical points of view—balance theory 
and reciprocation theory. When the implica- 
tions of these theories are examined, they 
lead, in some instances, to opposite predic- 
tions about how P's self-evaluations, in 
conjunction with his knowledge of the evalua- 
tions he receives from O, affect the evaluations 
he sends to O. 

Heider’s (1946, 1958) balance theory con- 
siders the way a person’s evaluations of other 
people, objects, or events will be influenced 
by his perception of the structural or evalua- 


1 This article is based on a doctoral dissertation
submitted to the faculty of the University of Penn- 
sylvania. I wish to thank Malcolm G. Preston, 
Albert Pepitone, and David R. Williams, the mem- 
bers of my dissertation committee, and also Alex 
Bavelas and Albert Hastorf of Stanford University 
for their helpful suggestions and criticisms. During 
the tenure of this investigation the author was sup-
ported by Predoctoral Fellowship MH-15,957 from
the National Institute of Mental Health, United 
States Public Health Service. 

2 Now at the State University of New York at
Buffalo. 


tive relations which exist in any triad consist- 
ing of himself (P), another person (O), and 
some object or event (X). These relations 
may be given signs indicating whether they 
are positive or negative. A balanced cogni- 
tive state exists among the elements in a 
triad whenever the product of the three signs 
is positive, and an unbalanced cognitive state 
exists in the triad whenever the product of 
the three signs is negative. The theory pre- 
supposes that P will maintain a balanced 
state and avoid or change an unbalanced state 
with respect to any given triad. Deutsch 
and Solomon (1959) have applied the bal- 
ance theory to interpersonal evaluations and 
have examined the role a person's self- 
evaluation plays in his evaluation of others. 
In this analysis the object, X, becomes the 
self or some attribute of the self. It was 
hypothesized that if P and O make similar 
evaluations of some aspect of P, then P will 
positively evaluate O, but if P and O make 
dissimilar evaluations of some aspect of P, 
then P will negatively evaluate O. An experi- 
ment designed to test this hypothesis showed 
that, although other motivations also af- 
fected subjects’ general impressions of one 
another, the derivations from balance theory 
were supported by the data. 

The Deutsch and Solomon experiment 
focused on subjects’ general impressions of 
one another (e.g., “desirability as a team- 
mate”) and was conducted in the context 
of a social situation where P’s evaluations of 
TABLE 1

TRIADS OF EVALUATIONS AND THEIR IMPLICATIONS
FOR BALANCE AND RECIPROCATION

                          Support for:
 PP     OP     PO      Balance    Reciprocation
 (1)    (2)    (3)       (4)          (5)
  +      +      +        Yes          Yes
  +      +      -        No           No
  +      -      +        No           No
  +      -      -        Yes          Yes
  -      +      +        No           Yes
  -      +      -        Yes          No
  -      -      +        Yes          No
  -      -      -        No           Yes


O were communicated only to the experi- 
menter and P neither anticipated nor re- 
ceived further evaluations from O. Can bal- 
ance theory also account for interpersonal 
evaluations which are directed toward the 
specific actions of another person and which 
are sent in the context of a social situation 
where further evaluative feedback is antici- 
pated and occurs? Such a situation was 
created in the present experiment. Subjects 
exchanged evaluations of one another’s task- 
oriented activities and each exchange was 
part of a continuing process. Specifically, P
evaluated every act that O performed (desig- 
nated PO), and O evaluated every act that 
P performed (OP). In addition, each person 
evaluated his own acts (PP). Every time 
P and O exchanged evaluations, a triad oc- 
curred which contained three evaluations 
relevant to the cognitive system of any per-
son P (i.e., PP, OP, and PO). Since the
limitation was imposed that each of the
three evaluations must be either positive or
negative, eight triadic combinations
were possible. These combinations and
their consequences for balance theory are 
presented in Columns 1-4 of Table 1. 

In the course of collecting and analyzing 
the data from such an experiment it became 
evident that a more straightforward recipro- 
cation theory would also have to be consid- 
ered. This theory assumes the existence of a 
need for esteem and approval from other 
people. If P’s goal in an interaction with O is 
to gain the latter’s approval, then, following 
the frustration-aggression hypothesis (Dol-
lard, Doob, Miller, Mowrer, & Sears, 1939),
one would expect P to aggress toward O to 
the extent that he receives unfavorable evalu- 
ations from O and his goal of receiving re- 
warding information is thereby frustrated. 
It is also possible to argue that in a situation 
of continued interaction P might attempt to 
“shape” O’s evaluative behavior by positively 
reinforcing O’s expressions of approval and 
negatively reinforcing O’s expressions of dis- 
approval. Whether P’s reactions are primarily 
emotional responses as suggested by the 
frustration-aggression hypothesis or instru- 
mental responses as suggested by the “shap- 
ing” hypothesis, the reciprocation theory pre- 
dicts that people will return rewarding in- 
formation to others in exchange for reward- 
ing information about themselves and will 
return punishing information to others in 
exchange for punishing information about 
themselves. When the last two signs of the 
triads in Table 1 are considered, reciproca- 
tion occurs if and only if the signs are the 
same. It will be noted that the balance and 
reciprocation theories lead to the same pre- 
dictions whenever a person makes a favorable 
evaluation of himself and to opposite pre- 
dictions whenever a person makes an un- 
favorable evaluation of himself. 
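
The two decision rules just described can be written out explicitly. The following is a minimal sketch, not the author's procedure: it assumes each evaluation is coded +1 (positive) or -1 (negative), and the function names `is_balanced`, `is_reciprocal`, and `triad_table` are hypothetical names chosen for illustration. Balance is taken as a positive product of the three signs (PP, OP, PO), and reciprocation as agreement between the last two signs (OP and PO).

```python
from itertools import product


def is_balanced(pp: int, op: int, po: int) -> bool:
    """Balance rule: the product of the three signs is positive."""
    return pp * op * po > 0


def is_reciprocal(op: int, po: int) -> bool:
    """Reciprocation rule: P sends O the same sign he received from O."""
    return op == po


def triad_table():
    """List all eight (PP, OP, PO) triads with both predictions."""
    rows = []
    for pp, op, po in product((+1, -1), repeat=3):
        rows.append((pp, op, po, is_balanced(pp, op, po), is_reciprocal(op, po)))
    return rows


if __name__ == "__main__":
    sign = {1: "+", -1: "-"}
    for pp, op, po, balanced, reciprocal in triad_table():
        print(sign[pp], sign[op], sign[po],
              "Yes" if balanced else "No",
              "Yes" if reciprocal else "No")
```

Under this coding the two rules agree on every triad in which PP is positive and disagree on every triad in which PP is negative, which is the contrast the experiment exploits.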

In an experiment to compare these theories, 
independent manipulations were designed to 
establish four experimental conditions: high 
self-evaluators who received positive evalua- 
tions from others, high self-evaluators who 
received negative evaluations from others, 
low self-evaluators who received positive 
evaluations from others, and low self-evalu- 
ators who received negative evaluations from 
others. In addition, P's perception of the 
competence of O was systematically varied, 
and the nature of the experimental procedure 
was such that the effects of the order in which 
O responded and was evaluated relative to P
could also be explored. 


METHOD 
Subjects 


Male students (N = 128), enrolled as freshmen in
either the Wharton School or the College of the 
University of Pennsylvania, were assigned randomly 
to one of four experimental conditions and were 
run four at a time in experimental sessions lasting
approximately 1.5 hours. Each subject was paid
$3, and participation was on a volunteer basis. 


Apparatus 


Subjects were seated around four sides of a 5 ×
5 foot table, and two upright partitions, placed
diagonally across the table, visually isolated them 
from one another. In front of every subject was a 
panel with three columns corresponding to the other 
three group members. Each column contained one 
green light and one red light for receiving positive 
and negative evaluations, respectively, and one 
double-throw switch for sending positive and nega-
tive evaluations. On the right-hand side of the
panel was a fourth double-throw switch with 
which a subject evaluated his own responses. In a 
room adjoining the experimental room was a master 
control panel containing 32 Mercury counters for 
recording subjects’ switch-throwing responses, and 
12 double-throw switches for turning on any one 
of the 24 lights on the subjects’ panels. The master 
control panel allowed the operator to record the 
number of positive and negative evaluations made 
by each subject, and to “feed back” any desired 
program of red and green lights. 


Procedure 


After assigning subjects randomly to the four 
seats around the table, the experimenter explained 
that the purpose of the sessions was to increase the 
effectiveness of the Graves’ (1948) Design Judge- 
ment Test, first, by assessing not only a person’s 
choice among the two or three designs presented in 
each item but also the reason for his choice, and 
second, by including peer group as well as “expert” 
evaluations of a person's answers. The subjects 
were told that in order to make a study of such 
test improvements possible, they were being asked 
to take both “written” and “oral” parts of the 
test. 

In the written part of the test, subjects wrote 
answers to Items 1-20 on sheets which were then 
scored by “experts.” Subsequently, each subject was 
informed of his score and the other members’ scores. 
These scores were experimentally controlled and 
constituted the major manipulation of the self- 
evaluation and the competence of the other group 
members, as follows: 

1. Self-evaluation manipulation. Two subjects in
each group of four were told that they had an- 
swered correctly 18 of 20 items, placing them at 
the eighty-eighth percentile (high self-evaluation con- 
dition, HS); and the other two subjects were told 
that they had answered correctly only 8 of the 20 
items, placing them at the twenty-fourth percentile 
(low self-evaluation condition, LS). 

2. Manipulation of the perception of others’
competence. Each subject was informed that one
other group member answered correctly 18 items 
(high competence member, HC), that one other 
member answered correctly 13 items (medium com-
petence member, MC), and that a third member
answered correctly 8 items (low competence mem- 
ber, LC). 

In the oral part of the test each subject indicated 
aloud both the design he preferred and a reason for 
his preference on 16 additional test items. The 
group of four subjects, who were labeled persons
“W,” “X,” “Y,” and “Z,” responded continuously
in alphabetical order, and a “trial” is defined by
each subject giving a separate test answer aloud, 
making an evaluation of his own answer, then 
receiving red-light or green-light evaluations pur- 
portedly from each other group member. Subjects 
were informed that the evaluations they gave their 
own answers were private responses, but that the 
evaluations they sent and received from others were 
public knowledge because the lights on the panels 
were interconnected. 

During the oral part of the test additional evalua- 
tive information was sent which was designed to 
strengthen the initial manipulations of the percep-
tion of others’ competence and the self-evaluation.
Supplemental manipulation of the perception of 
others’ competence: On all 16 trials subjects ob- 
served red-light and green-light evaluative feed- 
back about the other members’ answers. This feed- 
back was so controlled as to supplement the initial 
competence manipulation; each subject could ob- 
serve (on his own panel) feedback purportedly from 
other group members which indicated that the HC 
member was receiving 88% green lights, that the 
MC member was receiving 63% green lights, and 
that the LC member was receiving 38% green 
lights. Supplemental self-evaluation manipulation: 
During the first four trials a subject did not receive 
the group’s red-light and green-light evaluations of 
his own answers. Instead he received normative 
evaluations from the experimenter which were de- 
signed to supplement the initial self-evaluation 
manipulation; subjects in the HS condition re- 
ceived high “norm” scores from the experimenter, 
and subjects in the LS condition received low 
“norm” scores from the experimenter. 

3. Manipulation of the evaluations received from 
others. On Trials 5-16 subjects received evaluations 
purportedly from the other group members, but 
actually controlled by an experimenter’s assistant 
and programed as follows: Two subjects in each 
group of four received 10 green lights and 2 red 
lights from each of the other group members (high 
other evaluation condition, HO); and the other two
subjects received 4 green lights and 8 red lights 
from each of the other group members (low other 
evaluation condition, LO). In both the HO and LO 
conditions the distribution of green and red lights 
received over time was random. 
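
The programmed feedback for Trials 5-16 can be illustrated with a short sketch. This is not the procedure the experimenter's assistant actually followed, which the text does not specify beyond the counts and the random ordering; the function name `feedback_schedule` and the use of a seeded shuffle are assumptions made for illustration.

```python
import random


def feedback_schedule(n_green: int, n_red: int, seed=None):
    """Return a randomly ordered list of 'green'/'red' lights.

    The counts are fixed; only the order over trials is random.
    """
    rng = random.Random(seed)
    lights = ["green"] * n_green + ["red"] * n_red
    rng.shuffle(lights)
    return lights


# High other evaluation (HO): 10 green and 2 red lights on Trials 5-16.
ho_schedule = feedback_schedule(10, 2, seed=1)
# Low other evaluation (LO): 4 green and 8 red lights on Trials 5-16.
lo_schedule = feedback_schedule(4, 8, seed=1)
```

Each schedule covers the 12 trials from Trial 5 through Trial 16; a separate schedule of this kind would be needed for each of the three purported senders.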

The HS and LS conditions were systematically 
varied with the HO and LO conditions, and all 
subjects, regardless of the experimental conditions, 
were presented with information about the compe- 
tence of the other three group members (HC, MC, 
and LC). The three manipulations constitute a
2 × 2 × 3 experimental design.
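
The cells of this design can be enumerated directly. The sketch below is illustrative only; the variable and function names are hypothetical. It treats self-evaluation and other-evaluation as the two factors that define the four conditions to which subjects were assigned, and the competence of the other member as the factor every subject encountered at all three levels.

```python
from itertools import product

SELF_EVALUATION = ("HS", "LS")          # high vs. low self-evaluation
OTHER_EVALUATION = ("HO", "LO")         # high vs. low evaluations received
OTHER_COMPETENCE = ("HC", "MC", "LC")   # competence of the member being evaluated


def design_cells():
    """All 2 x 2 x 3 combinations of the three manipulations."""
    return list(product(SELF_EVALUATION, OTHER_EVALUATION, OTHER_COMPETENCE))


def assigned_conditions():
    """The four conditions to which subjects were randomly assigned."""
    return ["-".join(pair) for pair in product(SELF_EVALUATION, OTHER_EVALUATION)]


# 128 subjects spread evenly over the four assigned conditions.
subjects_per_condition = 128 // len(assigned_conditions())
```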
TABLE 2
MEAN COMPETENCE RANKINGS OF THE SELF

Condition    Rank 1       Rank 2        Change
             (Trial 4)    (Trial 16)
HS-HO        1.58         1.23          +0.35*
HS-LO        1.55         2.32          -0.77**
LS-HO        3.03         1.77          +1.26**
LS-LO        3.13         3.26          -0.13


Note.—A low number indicates a high rank. 
* p < .01, U test.
** p <.001, U test. 


Overview of the procedure. Thirty-two groups of 
four subjects each were told that, to make possible 
the study of certain improvements in a particular 
design judgment test, they would be given written 
and oral parts of the test. Each subject was induced 
to adopt a high or low evaluation of his own judg- 
ment ability by giving him false high or low scores 
on the written part of the test and false high or 
low normative feedback on the first four items he 
answered during the oral part of the test. He was 
also induced to perceive a competence hierarchy 
among the other members of his group by being 
given information about how well each of his col- 
leagues had done on the written part of the test 
and how each member was being evaluated by the 
rest of the group throughout the 16 trials. On 
Trials 5-16 each subject was given either positive 
or negative feedback which purportedly represented 
the evaluations of the other members of his group. 
In addition to being either positive or negative, 
this feedback was systematically varied so as to be 
either consistent or inconsistent with what the 
subject was led to believe about his own judgment 
ability. At the conclusion of the experiment, sub- 
jects filled out questionnaires, then the true purpose 
of the experiment was explained, further questions 
and comments solicited, and the importance of 
maintaining secrecy about the manipulations was 
emphasized. 


Dependent Variables 


The major dependent variables were the positive 
or negative evaluations that a subject sent to his 
colleagues, and the high or average³ evaluations
he gave himself on each of the 16 trials during the 


3 In a series of pilot experiments in which subjects 
were asked to evaluate their own responses either 
positively or negatively, such a large proportion 
(29.5%) of subjects never gave themselves negative 
evaluations on Trials 5-16 that the scale was changed 
from positive or negative to high or average. While 
it is true that an “average” evaluation may not 
represent a negative sentiment, balance principles 
are assumed to apply to relative distinctions be- 
tween evaluations as well as to absolute distinctions. 
Further experimental work may show that this as- 
sumption, under certain conditions or for some 
individuals, is questionable. 


STEPHEN C. JONES 


oral part of the test. Paper-and-pencil data were 
obtained from subjects at various points in the 
experimental procedure. Each subject was asked 
after Trial 4 and again after Trial 16 to rank the 
four group members (including himself) on the 
basis of competence. A questionnaire administered 
after Trial 16 required a subject to indicate, first, 
the extent of his judgment ability and, second, 
whether the “expert” evaluations of the written 
part of the test or the peer evaluations of the oral 
part of the test was the more valid indicator of a 
person's ability. 


RESULTS 


In the analyses which follow, 4 of the 128 
subjects run in the experiment have been 
excluded because they expressed suspicion 
that the evaluative information they had re- 
ceived was controlled by the experimenter. 
Fortunately, these deletions were evenly dis-
tributed, leaving 31 subjects in each experi-
mental condition.⁴


Consequences of the Experimental Manipula- 
tions 


As shown in Table 2, after Trial 4 sub-
jects in the HS conditions ranked themselves
higher in competence than those in the LS
conditions (p < .001, Mann-Whitney U
test),⁵ and this difference persisted through
Trial 16 (p < .01, U test). (The self-ranking
after Trial 16 was also affected by the evalu-
ations received from the group; subjects in
the HO conditions ranked themselves higher 
than those in the LO conditions—p < .001, 
U test.) When asked on the postexperimental 
questionnaire to rate the degree of their judg- 
ment ability on a 7-point scale, subjects 
rated themselves on the average 5.09 in the 
HS-HO condition, 4.34 in the HS-LO condi- 
tion, 3.50 in the LS-HO condition, and only 
2.34 in the LS-LO condition. The difference 
between any two of these means is signifi- 
cant beyond the .01 level, U test. On their 
switch-throwing responses, subjects in the 
HS conditions evaluated themselves con-
sistently more positively than those in the

4 A large proportion of the 124 subjects included
in the analyses were unacquainted with the other
3 members of their respective groups. In 279 of
the 372 possible cases a subject indicated that he
knew another member “not at all,” in 54 cases
“moderately,” and in only 39 cases “fairly well”
or “very well.”

5 All statistical tests are two-tailed. 




TABLE 3
MEAN COMPETENCE RANKING OF OTHERS

Condition    Rank 1    Rank 2
HC            1.32     [illegible]
MC            1.98     2.[illegible]
LC            2.69     2.58

Note. A low number indicates a high rank.
* p < .01, sign test.


LS conditions (p < .02, U test). (There was
no difference, however, between the HO con- 
ditions and the LO conditions.) The manipu- 
lations of the perception of others’ compe- 
tence clearly affected the paper-and-pencil 
rankings of the others, as shown in Table 3. 
Both the self-evaluation and competence 
manipulations had the intended effects.


Experimental Analysis of Interpersonal 
Evaluations 


Balance theory predicts that for the evalu- 
ations a subject makes of the other group 
members an interaction will occur between 
the self-evaluation conditions (i.e., HS and
LS) and the other evaluation conditions (i.e., 
HO and LO); reciprocation theory predicts 
that more positive evaluations will be sent in 
the HO conditions than in the LO conditions. 
The data relevant to these predictions are 
presented in Table 4. A three-way analysis 
of variance (Winer, 1962) on these data re- 
vealed that the F scores for all interactions 
and all but one main effect were less than 1. 
The exception was a significant main effect 
of the controlled evaluations received from 
the other group members (F = 7.01, p <
.01). Subjects in the HS-HO and LS-HO
conditions sent 58.1% positive evaluations,
and subjects in the HS-LO and LS-LO con-
ditions sent 52.8% positive evaluations (t =
2.66, df = 122, p < .01). In the LS condi-
tions alone, where the balance and reciproca-
tion predictions are opposite, a higher per-
centage of positive evaluations was sent in
the LS-HO condition than in the LS-LO con-
dition (t = 2.62, df = 60, p < .02). This
result supports the reciprocation theory.


TABLE 4

MEAN PERCENTAGE OF POSITIVE EVALUATIONS OF
OTHERS AS A FUNCTION OF EXPERIMENTAL
CONDITIONS

          HS-HO    HS-LO    LS-HO    LS-LO
HC         57.4     52.5     56.1     52.5
MC         60.0     52.8     61.2     52.7
LC         53.6     54.5     60.4     51.8


The reciprocation theory was further ana-
lyzed by computing for each subject the per-
centage of evaluations sent to others which
supported reciprocation (see Table 1). Such
an analysis revealed that reciprocation was
affected by the perceived competence of the
person being evaluated as well as by the
order in which this person responded relative
to the subject. The mean percentage of re-
ciprocating responses was 52.0 to the HC
members, 52.0 to the LC members, and 56.7
to the MC members (p < .02 for both the
MC-HC and the MC-LC comparisons, Wil-
coxon test). The first person to respond fol-
lowing any given subject (O-1) received
55.8% reciprocated evaluations, the second
person to respond (O-2) received 52.8% recip-
rocated evaluations, the third person to re-
spond (O-3) received 51.7% reciprocated
evaluations. Seventy-one subjects recipro-
cated more often to O-1 than to O-3, and 41
subjects reciprocated more often to O-3 than
to O-1 (p < .01, sign test).


TABLE 5
VALIDITY OF EXPERT AND PEER EVALUATIONS

Source    HS-HO    HS-LO    LS-HO a    LS-LO
Expert      14       23        4         13
Peer         8        3       21          9
Both         9        5        5          9

a One subject in this condition failed to answer the question.


Correlations between Evaluations of Self and
Others


Another way of examining the balance
hypothesis is to compute the correlation be-
tween the percentage of positive evaluations
a subject makes of himself and of others
during the 12 experimental trials. From bal-
ance theory one would predict that when P
receives favorable evaluations from O (i.e., in
the HO conditions) his evaluations of him-
self will be positively related to those he
makes of O, and when P is unfavorably eval-
uated by O (i.e., in the LO conditions) his
self-evaluations will be negatively related to
the evaluations he makes of O. Contrary to
this prediction the Pearson r for subjects in
the HO conditions was −.37 (df = 60, p <
.01) and for subjects in the LO conditions
was +.06 (df = 60, ns). The difference be-
tween these two correlations is significant (z
= 2.44, p < .02).
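The reported difference between the two correlations can be checked with a short calculation. The sketch below assumes the comparison was Fisher's r-to-z test for two independent correlations, which the text does not name, and it takes n = df + 2 = 62 subjects for each pair of conditions. The function name is illustrative only.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for two independent Pearson correlations.

    Returns (z, two-tailed p). This is an assumed procedure; the
    article does not state which test produced its z value.
    """
    z1 = math.atanh(r1)  # Fisher transformation of each correlation
    z2 = math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z2 - z1
    z = (z2 - z1) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed normal probability
    return z, p


# HO conditions: r = -.37; LO conditions: r = +.06; df = 60 implies n = 62 each.
z, p = compare_independent_correlations(-0.37, 62, 0.06, 62)
print(f"z = {z:.2f}, two-tailed p = {p:.3f}")
```

Under these assumptions z rounds to 2.44 and p falls between .01 and .02, which agrees with the values given in the text.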


The Validity of Expert and Peer Group 
Evaluations 


Subjects’ reactions to the source of either 
favorable or unfavorable evaluations of them- 
selves may be further illuminated by consid- 
ering whether they perceived the expert eval- 
uations of the written part of the test, the 
peer group evaluations of the oral part of the 
test, or both sources of evaluation as the most 
valid indicator of a person's ability. The 
frequency data on this question appear in
Table 5. Although there was no reliable
tendency to view one type of evaluation as 
being more valid than another, subjects per- 
ceived as most valid that source of evalua- 
tion which gave them the higher score. 


DISCUSSION


Interpersonal evaluating behavior in this 
experiment was more indicative of recipro- 
cating than balancing tendencies. Whereas 
any balance motivations that may have been 
operating were not affected by the group 
conditions, reciprocation was affected both 
by the perceived competence and by the 
responding order of the subject being evalu- 
ated. The reciprocation tendency has been 
interpreted as involving a desire to gain 
social or other forms of external approval. 
Aside from the findings concerning recipro- 
cation, other evidence that such needs were 
present is that subjects perceived as most 
valid that source of evaluative information 
that was most personally rewarding to them. 

Why there was little tendency toward 
balance, at least relative to other interpreta- 
tions of the data, becomes a particularly im- 
portant question in light of the fact that 
Deutsch and Solomon (1959), in a similar ex-
periment, obtained support for the theory.
One possible basis for the difference is that 
in the present experiment subjects were eval- 
uating single actions of other people, and in 
Deutsch and Solomon's experiment they were 
evaluating more general characteristics of 
other people. Balance theory may be appli- 
cable to only global conceptions of the self 
and other people and not to conceptions of 
specific acts. 

Another factor which distinguishes the 
present study from the Deutsch and Solomon 
study, and which may account for the rela- 
tively weak effects of balance motivations in 
the present investigation, is the type of social 
situation in which the subjects were involved. 
Whereas subjects in the Deutsch and Solo- 
mon experiment sent their evaluations of the 
other person to the experimenter and neither 
anticipated nor received further evaluative
feedback from the other person, those in the 
present experiment sent their evaluations di- 
rectly to one another and both anticipated 
and received further evaluative feedback
from other subjects. When a person’s evalua- 
tions can have immediate consequences on 
the behavior of other people, he may be less 
interested in achieving a balanced cognitive 
state and more interested in the instrumental 
value of the evaluations he sends as a method 
of controlling the behavior of others. 

The significant difference between the self- 
other correlations indicates that the fewer 
positive evaluations a subject gave himself 
the more likely he was to respond favorably 
to others when they positively evaluated him, 
but that this tendency was significantly re-
duced when others negatively evaluated him.
This finding is opposite to the balance pre-
diction. Furthermore, it may mean that peo-
ple who have little esteem for themselves, or
who at least have little confidence in certain
of their own abilities, also have a greater
desire to seek and gain approval from others.
If this interpretation is correct, self-evalua-
tions may in some circumstances play a role
in interpersonal evaluating behavior that is
quite different from that assumed in balance
theory. The problem posed by this interpreta-
tion is to distinguish the conditions under
which the self-evaluation regulates a need for
external approval from the conditions under
which it becomes a reference point for or-
ganizing a balanced cognitive state. One
variable which may distinguish such condi- 
tions is the degree of certainty a person has 
about his evaluation of some ability or per-
formance. When a person is highly certain
about this evaluation, discrepant information 
from others may become suspect and balanced 
responses would be observed; when he is 
highly uncertain, however, tendencies toward 
balance have not yet become salient and the 
need for social approval is a more effective 
force. The fact that in the present experi- 
ment subjects’ paper-and-pencil self-evalua- 
tions changed in certain conditions from the 
first to the second rating (see Table 2) sug- 
gests that, in spite of the initial experimental 
inductions, the subjects remained somewhat 
uncertain of their self-evaluations. The ex- 
istence of such a situation could account for 
the relatively strong reciprocation effects. 

The fact that more reciprocating responses 
were directed toward the MC member than 
toward either the HC or LC member sug- 
gests that reciprocation was not entirely an 
emotional response (i.e., as suggested by the 
frustration-aggression interpretation) and 
that it involved, in part at least, cognitive— 
instrumental—strategies. If only emotional 
tendencies were operating, one might suppose 
that the strongest reactions would result 
from the evaluation of the HC member and 
therefore the greatest amount of reciprocation 
would be toward this member. However, if 
instrumental tendencies were also operating, 
one might argue, as follows, in favor of the 
result which was obtained: subjects perceived 
the evaluations of the HC member as least 
susceptible to external influence (for related 
evidence, see Thibaut & Riecken, 1955) and 
the evaluations of the LC member as least 
important; if such perceptions existed, sub- 
jects might well have concentrated their in- 
strumental, “shaping” efforts on the evalua- 
tions of the MC member. 

The finding that there were more recipro- 
cating responses directed toward the first 
person to respond following a subject (O-1)
than toward the third person to respond
(O-3) may reflect either a decrement, over
time, in the emotion produced by the evalu-
ations received, or a decrement in memory
which impeded the subject’s ability to
“shape” another person’s behavior. It is also
possible that the relation between reciproca-
tion and order was caused by subjects’
adopting one type of strategy in their evalu-
ations of O-1 and another type of strategy in
their evaluations of O-3. Both strategies in-
volve the use of evaluations to manipulate
another person's behavior. In one case, how- 
ever, P attempts to “shape” O's behavior by 
rewarding or punishing the evaluative re- 
sponses that O has already emitted. In the 
other case, P attempts a priori to influence 
the type of behavior that O will emit; that is, 
P assumes that O is reciprocating and conse- 
quently he sends positive evaluations to O. 
In this second strategy P is using an incen-
tive rather than a reinforcement to influence
O's behavior. This incentive tactic may be a
special case of “complimentary other enhance- 
ment” which is discussed by Jones (1964) in 
his recent monograph on ingratiation. 
THE EVALUATION OF COMPLEX SOCIAL STIMULI*
3 experiments are reported in which judges evaluate compound social stimuli
composed of separable elements. The results are consistent with a model that
treats the compound as being equal to a weighted average of its constituents, 
where the weight associated with each element is directly related to its ex- 
tremity. This model differs from Osgood and Tannenbaum’s congruity approach 
and from Fishbein’s summation model in its assumption that neutral elements 
do affect compound judgments. Moreover, the suggested model takes account 
of the number of elements to be combined, and suggests that, other factors
held constant, the number of contributing elements is directly related to the
extremity of S’s resultant judgment; hence, a compound stimulus may at times 
receive a more extreme rating than any of its constituent elements. 


Our subjective beliefs about the world may 
profitably be regarded as a complex resultant, 
depending upon a variety of cues which are 
often in conflict. For example, if a given
voter believes that Candidate X is favorably 
disposed towards the United Nations, this 
conclusion may be based on a relatively large 
pool of statements and reactions that have 
been attributed to X during the course of the 
campaign. It is, moreover, important to note 
that these statements will generally reflect a 
range of attitude positions. Until recently, 
relatively little information was available 
concerning the way in which people might 
combine such diverse inputs in forming a 
final judgment. Several studies have now 
been concluded, however, which seek to de- 
termine the “rules” whereby the judgment of 
a compound stimulus (composed of several 
separable elements) may be predicted from 
the stimulus values of its constituent parts. 

Osgood and Tannenbaum’s (1955) paper 
on the congruity model presents what is per- 
haps the best-known theoretical approach to 
this problem. For each dimension of the
meaning space, this model states that the 
judgment elicited by a compound stimulus 
can be predicted by weighting the contrib- 
uting elements in terms of their extremity 
(i.e., deviation from neutrality on a bipolar
dimension). Stated algebraically, on any 
given scale or factor, the deviation from 


1 All statements are those of the authors and do 
not necessarily represent the opinions or policy of 
the Veterans Administration. 


neutrality of a compound stimulus (d_o),
made up of two components, a and b, is pre-
sumed to be predictable from the following
formula:

$$d_o = \frac{|d_a|}{|d_a| + |d_b|}\,d_a + \frac{|d_b|}{|d_a| + |d_b|}\,d_b$$

where:
|d_a| and |d_b| = the absolute deviation of the
constituent elements from the neutral point
(d_a) and (d_b) = the algebraic deviation.
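A small numerical sketch may make the weighting in the congruity formula concrete. It assumes the two-component form described above, with deviations expressed as signed distances from the neutral point. The function name is hypothetical, and returning 0 when both components are neutral is an added convention, since the formula itself is undefined in that case.

```python
def congruity_prediction(d_a, d_b):
    """Predicted deviation of a two-element compound under the congruity model.

    Each element is weighted by its absolute deviation from neutrality,
    so the more extreme element dominates the result.
    """
    total = abs(d_a) + abs(d_b)
    if total == 0:
        # Assumption: two neutral elements yield a neutral compound.
        return 0.0
    return (abs(d_a) / total) * d_a + (abs(d_b) / total) * d_b


# An element at +3 paired with one at -1: weights are 3/4 and 1/4.
print(congruity_prediction(3, -1))
# A neutral element receives zero weight, so it leaves the compound unchanged.
print(congruity_prediction(3, 0))
```

The second call illustrates the property at issue in this paper: under the congruity formula a neutral element has no effect on the compound judgment.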


While the Osgood-Tannenbaum model has 
not been verified in a detailed fashion, it has 
received empirical support from several 
studies which confirm the assumption that the 
average judge places disproportionate weight 
upon extreme elements when he is presented
with multiple inputs. For example, Weiss
(1963) had the subjects judge the probable
attitude position of a series of “others,” each 
of whom had presumably endorsed three 
items which varied in favorability towards 
capital punishment. His results revealed that 
there was a consistent tendency for these 
compound judgments to be more extreme 
than might have been predicted from the 
mean scale value of the contributing ele-
ments. Similarly, using photographs as his
stimuli, Willis (1960) found that when sub-
jects rated the personal attractiveness of
several groups, each composed of two or three
individuals, the rating of each group (as a
whole) was typically more extreme than the
mean rating elicited by its constituent mem-
bers. This effect has also been noted in
studies by Kerrick (1958) and Podell and
Podell (1963), but did not appear in experi-
ments by Podell (1961) and by Levy and
Richter (1963).

A recent attempt to evaluate the congruity 
principle (Triandis & Fishbein, 1963) yielded 
somewhat less positive results. In this study,
Greek and American college students rated
several unknown persons who had been
briefly described in terms of race, occupa-
tion, religion, and nationality. For example,
subjects were to rate someone described as:
"WHITE, COAL MINER, PORTUGUESE, SAME 
RELIGION AS YOU ARE." The results indicated 
that the obtained ratings were more readily 
predictable from the occupational informa- 
tion alone than from the congruity principle, 
which did, however, permit predictions of 
above-chance accuracy. The most accurate 
predictive model of all, however, was a sum- 
mation model, based on earlier work by Fish- 
bein (1961). 

Rather than assuming that the compound 
stimulus represents some “balance” or
weighted average of the elements which com-
prise it, Fishbein hypothesizes a simple sum-
mation of the component evaluations. In al-
gebraic terms his model may be represented
by the following formula: 


$$A_o = \sum_{i=1}^{N} B_i a_i$$

where:

A_o = the attitude toward the complex stimulus o

B_i = the probability that there is an association
between Object o and some other object,
concept, value, or goal, x_i

a_i = the evaluation associated with x_i; that is,
the subject's attitude toward the related
object x_i

N = the number of beliefs about o; that is, the
number of x_i's with which o is associated.


Note that Fishbein's model is concerned only 
with predicting evaluation (good-bad), while 
the Osgood-Tannenbaum model presumably 
applies to other dimensions as well. 

In the Triandis-Fishbein study, the various 
beliefs (B_i's) about the persons being rated
(i.e., their nationality, religion, etc.) were
given a value of 1.00, since the subjects were
essentially certain that these elemental beliefs
were part of the complex stimulus to be
rated. Under these conditions, the individual's
attitude towards (or evaluation of) the stim-
ulus person was hypothesized as equaling
some function of the sum of the evaluations
associated with the component x_i's (the oth-
er's race, religion, nationality, and occupa-
tion). While the summation model was more
highly correlated with the obtained evalua- 
tions than any of the other approaches tested, 
this form of evidence is unfortunately equiv- 
ocal, since the correlations would have been 
exactly the same had the authors used the 
mean evaluation (rather than the sum) as 
their predictor. 
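The equivalence noted above follows because every stimulus person had the same number of components, so the mean is the sum divided by a constant, and a Pearson correlation is unaffected by rescaling. A minimal sketch, using invented component evaluations and invented obtained ratings (none of these numbers come from the Triandis-Fishbein study):

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def summation_attitude(beliefs, evaluations):
    """Summation model: A_o is the sum of B_i * a_i over all components."""
    return sum(b * a for b, a in zip(beliefs, evaluations))


# Hypothetical evaluations of four components for five stimulus persons.
components = [
    [2, 1, -1, 3],
    [-2, 0, 1, -1],
    [3, 3, 2, 1],
    [0, -1, -2, 1],
    [1, 2, 0, -3],
]
obtained = [4.0, -1.5, 6.0, -2.0, 0.5]  # hypothetical obtained ratings

# All beliefs B_i are set to 1.00, as in the study described above.
sums = [summation_attitude([1.0] * 4, c) for c in components]
means = [s / 4 for s in sums]

r_sum = pearson_r(sums, obtained)
r_mean = pearson_r(means, obtained)
print(r_sum, r_mean)
```

The two correlations coincide, so a correlational comparison of this kind cannot separate a summation model from an averaging model when the number of components is fixed.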

One interesting derivation from the sum- 
mation model holds that, “each new com- 
ponent, if it is positively evaluated, adds 
some positive evaluation to the complex 
stimulus [Triandis & Fishbein, 1963, p. 
451]." This enhanced evaluation should oc- 
cur even if the new positive component has a 
less positive evaluation than the average of 
the components previously given; such an 
effect has in fact been reported in two sepa- 
rate studies (Fishbein & Hunter, 1964a, 
1964b). In these experiments, subjects were 
given different amounts of information about 
a fictitious person. All of the information was 
positive; however, it was given in such a 
sequence that each additional item was 
somewhat less positive than that which pre- 
ceded it. As a result, subjects who were 
given the greatest amount of information re- 
ceived items which, on the average, were less 
positive than those given less information; 
the more information given, however, the 
greater the sum of the elements and the more
positive should be the subject’s overall im- 
pression. Results from these studies sup- 
ported the summation model; that is, the 
subject’s evaluation became more positive as 
the total amount of positive information was 
increased, even though the average value of 
this information became less and less posi-
tive. On the other hand, Anderson (1965)
reports a similar study in which the ob- 
tained results clearly favored an averaging 
model, rather than a summative one.
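The design of these studies pits the two models against each other because the running sum and the running mean of a declining positive sequence move in opposite directions. The sketch below uses invented item values, chosen only so that each item is positive but less positive than the one before it.

```python
from itertools import accumulate

# Hypothetical scale values: all positive, each smaller than its predecessor.
item_values = [3.0, 2.0, 1.0, 0.5]

# Summation model: the impression tracks the running total.
running_sums = list(accumulate(item_values))

# Averaging model: the impression tracks the running mean.
running_means = [total / (i + 1) for i, total in enumerate(running_sums)]

for k, (s, m) in enumerate(zip(running_sums, running_means), start=1):
    print(f"{k} items: sum = {s:.3f}, mean = {m:.3f}")
```

With such a sequence a summation model predicts increasingly favorable impressions as items are added, whereas an averaging model predicts decreasingly favorable ones.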

Summation theory also leads to the related 
prediction that when the subject is given a 
compound stimulus whose components are all 
located on the same side of the evaluative
scale, the compound should receive a more 
extreme rating than any of the components. 
Kerrick (1958) and Weiss (1963) have re- 
ported that this result is, indeed, often ob- 
tained. Returning briefly to the Osgood- 
Tannenbaum approach, it should be noted 
that despite the emphasis which their model 
places upon extreme items, this approach 
assumes that the compound stimulus repre- 
sents a weighted average, and hence should 
never be more extreme than the most polar- 
ized of the constituent elements. 

The studies which are reported below were 
designed to obtain further data relevant to 
an understanding of how multiple inputs 
are typically combined to yield a single 
judgment. We were, moreover, particularly 
interested in what we termed the “extremity
effect”: the tendency to place particular
emphasis upon extreme elements in judging 
compound stimuli (see Weiss, 1963, and 
Willis, 1960, for representative experiments 
showing this effect).

In undertaking this research, it seemed 
particularly important to take careful ac- 
count of the scaling assumptions which were 
involved in the measurement procedures. 
Most previous studies in this area have uti- 
lized some variant of the method of equal- 
appearing intervals, a technique based upon 
the untestable assumption that the judge will 
(as instructed) divide the total attitude con-
tinuum into a set of equal subjective seg-
ments. Unfortunately, the effects noted in
these studies may reflect, to an unknown 
degree, certain inadequacies of the equal- 
intervals method, rather than any basic cog- 
nitive characteristics of the subjects. For ex- 
ample, Figure 1 depicts a segment of a 
bipolar rating scale in which, despite the 
experimental instructions, the various rating 
categories do not represent subjectively equal 
FIG. 1. A hypothetical case in which inequality of
subjective intervals leads to an apparent extremity
effect, even though the input elements are actually
being averaged. (Horizontal axis: subjective scale.)
segments on the subjective continuum; in this
case, the judge has established a scale in
which the extreme categories are broader
(i.e., encompass a wider subjective range)
than those near the middle of the scale.
Given a subjective scale of this sort, a judge
who is actually averaging input elements
would exhibit what might erroneously be
interpreted as an extremity effect. Thus, in
Figure 1, a pair of stimuli from Categories 1
and 3 (S1 and S3) is “seen” as being midway
between its constituents (Smean); yet, be-
cause of the inequalities among the subjec-
tive category widths, the pair receives a
rating of 1. While several alternative scaling
approaches are available (Torgerson, 1958), 
it is disheartening to recognize that each of 
these methods involves underlying assump- 
tions which might not, in fact, be warranted. 
As a result, we were interested in applying 
scaling techniques that would clearly re- 
flect the cognitive behavior of our subjects 
without placing undue reliance on the initial 
scaling assumptions which we chose to accept. 
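The artifact illustrated in Figure 1 can be reproduced with a few lines of arithmetic. The category boundaries and stimulus positions below are invented for illustration; the only property that matters is that the extreme category (Category 1) covers a wider stretch of the subjective continuum than the categories nearer the middle.

```python
from bisect import bisect_right

# Hypothetical upper boundaries of rating Categories 1, 2, and 3 on the
# subjective continuum. Category 1 spans values below 4.0, Category 2
# spans 4.0 up to 5.0, and Category 3 spans 5.0 up to 6.0.
CATEGORY_UPPER_BOUNDS = [4.0, 5.0, 6.0]


def rating_category(subjective_value):
    """Map a position on the subjective continuum to a rating category."""
    return bisect_right(CATEGORY_UPPER_BOUNDS, subjective_value) + 1


s1 = 1.0  # a stimulus the judge places in Category 1
s3 = 5.5  # a stimulus the judge places in Category 3
s_mean = (s1 + s3) / 2  # a judge who truly averages perceives the pair here

print(rating_category(s1), rating_category(s3), rating_category(s_mean))
```

Although the judge in this sketch averages perfectly, the pair is rated in Category 1 rather than Category 2, which an equal-intervals analysis would read as an extremity effect.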

In addition to the issues discussed above, 
two of the studies in this series were con- 
cerned with the possible effects of the sub- 
ject's own attitude upon his judgments of 
compound stimuli. Our reasoning here started 
with the assumption that conflicting inputs 
generally result in ambiguous stimuli; since 
interpretative distortions are thought to be 
most prominent under these conditions, we 
were led to the prediction that judgmental 
effects attributable to attitude differences 
would be more pronounced when compound 
stimuli, as opposed to elemental stimuli, were 
presented for judgment. 


EXPERIMENT I 
Method 


Experiment I was conducted in two parts. First, 
a group of college students rated the concept college 
fraternities plus several filler concepts on six evalua- 
tive scales, each of which contained 11 rating cate- 
gories ranging from 0 (good) to 10 (bad). The 
scales were: kind-cruel, worthless-valuable, unfair- 
fair, honest-dishonest, nice-awful, and bad-good. 
Attitude toward fraternities was quantified by 
summing across scales and the resulting scores were 
employed to divide the subjects into six distinct 
and relatively homogeneous attitude groups (n= 
13 for each group). 


When the attitude assessment had been com-
pleted, the rating task was explained. Subjects were
told that they would be presented with several 
pairs of statements that had been endorsed by
various students as best reflecting their attitudes
towards fraternities. The subject's task was to con-
sider the endorsements of each student and then
indicate this “other’s” probable attitude toward
fraternities by responding to a good-bad scale as he
(the other) might have done. A similar set of in- 
structions was employed to obtain ratings of the 
single statements which comprised the pairs; half 
the subjects rated the pairs first and then the single 
statements, and half followed the reverse sequence.

To emphasize the nature of the subject’s task, 
each statement (or pair of statements) was pre- 
sented in the following format: 


Student H. R.: College fraternities are hopelessly 
out of date. 
The good and bad points of col- 
lege fraternities balance each 
other. 

Student H. R. probably sees fraternities as . . . 


Good ___:___:___:___:___:___:___:___:___:___:___ Bad


Stimuli. Twenty-two statements were selected
from Remmers and Silance's (1934) generalized 
scale for measuring attitudes toward any institu- 
tion. They were chosen to represent the complete 
spectrum of attitude positions and care was taken 
to include only those items that could be readily 
applicable to college fraternities. The 22 items were 
divided into two parallel subsets of 11 items each; 
thus there were two items to represent each step 
along the attitude continuum, as scaled by Rem-
mers.

In constructing pairs of items, the following re- 
strictions were employed: 

1. The items included in each pair were sepa- 
rated by 0-5 scale units. Discrepancies of greater 
magnitude were avoided in order to reduce in- 
credulity (Osgood & Tannenbaum, 1955). 

2. The order of presentation was balanced so that, 
for example, pairs composed of a statement at 
Position 3 and another at Position 1 were pre-
sented in both of the possible sequences (i.e., 3-1
and 1-3). 

3. The single statements and the pairs were each 
presented in a fixed random order, to balance out 
any systematic order effects that might affect the 
subjects’ responses as they proceeded through the 
experimental task. 


Results 


The data from this experiment were ana- 
lyzed in several ways. First, assuming that 
the judges were employing a subjective scale 
whose intervals were in fact equal, an analy-
sis of variance design was adopted to com-
pare the responses of the various attitude
groups when presented with single statements 
and with statement pairs. Two important 
effects were noted: (a) Attitude did not have
a significant effect upon judgment; neither 
the main effect nor any of the interactions 
involving attitude reached conventional 
levels of significance. (b) Statement pairs 
tended to be more extreme than the mean 
value of their constituent items. 

The failure to find significant attitude 
effects was particularly important methodo- 
logically, since it permitted us to combine the 
six attitude groups in subsequent analyses. 
This first analysis was not completely con- 
vincing, however, with regard to the extremity 
effect. For reasons noted above (see Figure 
1), we were particularly concerned about 
“built-in” effects which might be associated 
with the method of equal-appearing intervals 
and sought to devise a technique based on 
rather minimal ordinal assumptions. 

An ordinal analysis. In any scaling model, 
the category containing the mean of an opin- 
ion pair must fall somewhere between the 
two contributing items. It was therefore safe 
to infer an extremity effect wherever the pair 
judgment was more extreme than either of 
the constituent elements; similarly, a neu- 
trality effect could be inferred whenever the 
pair elicited a judgment less extreme than 
either of its constituents. More generally, if 
we assume that there is a roughly symmetric 
dispersion of pair judgments, then we can 
infer the relative location of the distribution 
mean by observing its tails. In particular, if
the subject's reaction to the given type of 
item pair results in a higher proportion of 
responses which are more extreme than 
either of its components, as compared with 
the proportion of responses less extreme than 
the components, then we may assume that the 
central tendency of these judgments is more 
extreme than the midpoint of the component 
elements. In brief, the approach which was 
finally adopted focused upon those cases in 
which the subject’s response to an opinion 
pair was either more extreme or more neutral 
than his ratings of the constituent elements; 
instances in which a subject’s response was 
within the range defined by the constituents 
were not considered. 
TABLE 1

NUMBER OF SUBJECTS SHOWING NET EXTREMITY AND
NEUTRALITY EFFECTS IN RESPONSE TO HOMOGENEOUS
PAIRS

              n showing:                    n with equal        p values for
Components    Net extremity  Net neutrality frequency of        extremity versus
              effect         effect         extremity and       neutrality a
                                            neutrality responses
0, 0          b
1, 1          19             14             10                  ns
2, 2          [illegible]    18             10                  [illegible]
3, 3          26             1              8                   .01
4, 4          14             5              6                   [illegible]
5, 5          b
6, 6          15             2              5                   [illegible]
7, 7          26             6              [illegible]         [illegible]
8, 8          22             9              15                  .05
9, 9          19             6              10                  .05
10, 10        b


a Two-tailed sign test.

b This analysis cannot be applied to pairs that include one
or more components at either of the extreme positions (0 or 10),
or to pairs whose constituents are essentially "balanced" at
the midpoint (5). For pairs including elements at Positions 0
or 10, it is obvious that the subject's response cannot be more
extreme than both constituents; for pairs that "balance" at
the midpoint, any scorable response is more extreme than the
constituents and hence this case too fails to provide an appro-
priate test for the extremity effect.


In accordance with the logic outlined above, 
the following approach was adopted: 

1. For each subject, the various opinion 
pairs were grouped in terms of the subject's 
response to the individual items when these 
were judged as singles. For example, for each 


TABLE 2 


NuMBER or SUBJECTS SHOWING NET EXTREMITY AND 
NEUTRALITY EFFECTS IN RESPONSE TO ADJACENT 


Pargs 
n showing: e i 
Compo- quency of 2 GS 
nents extremity | for extrem- 
Net ex- | Netneu- | and neu- | ity versus 
tremity | trality tality | neutrality 
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a Two-tailed sign test. 
b See Table 1, Footnote b. 
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subject, we looked at all pairs comprised of 
elements to which he had given ratings of, 
let us say, 2 and 3. 

2. Since a given judge might have re- 
sponded to many pairs whose elements had 
been given identical ratings, we next deter- 
mined whether he more frequently exhibited 
extremity or neutrality responses on a given 
type of opinion pair. 

3. Finally, for each type of opinion pair, a 
count was conducted to determine the num- 
ber of judges for whom the extreme responses 
dominated the neutrals, and the number of 
judges for whom the reverse was true. 
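
The three steps above can be sketched in code. This is only an illustrative sketch, not the authors' procedure: the tuple layout, the function names, and the exact definitions of an "extremity" response (pair rated beyond both constituents, away from the midpoint) and a "neutrality" response (pair rated beyond both constituents, toward the midpoint) are assumptions, as is the treatment of subjects with no scorable responses as "equal."

```python
from collections import Counter, defaultdict

NEUTRAL = 5  # midpoint of the 0-10 rating scale


def classify(pair_rating, a, b):
    """Label one pair judgment relative to its two constituent ratings (assumed definitions)."""
    lo, hi = min(a, b), max(a, b)
    if lo == 0 or hi == 10:
        return "other"                      # end positions cannot be exceeded (Footnote b)
    if hi < NEUTRAL:                        # both constituents below the midpoint
        if pair_rating < lo:
            return "extremity"
        if pair_rating > hi:
            return "neutrality"
    elif lo > NEUTRAL:                      # both constituents above the midpoint
        if pair_rating > hi:
            return "extremity"
        if pair_rating < lo:
            return "neutrality"
    return "other"                          # at or between the constituents, or untestable


def tally(judgments):
    """judgments: iterable of (subject, rating_a, rating_b, pair_rating)."""
    per_subject = defaultdict(Counter)      # (pair type, subject) -> response counts
    for subject, a, b, pair_rating in judgments:
        pair_type = (min(a, b), max(a, b))                      # Step 1: group by single ratings
        per_subject[(pair_type, subject)][classify(pair_rating, a, b)] += 1
    table = defaultdict(Counter)            # pair type -> number of subjects per outcome
    for (pair_type, _subject), counts in per_subject.items():
        diff = counts["extremity"] - counts["neutrality"]       # Step 2: dominant response
        if diff > 0:
            outcome = "net_extremity"
        elif diff < 0:
            outcome = "net_neutrality"
        else:
            outcome = "equal"
        table[pair_type][outcome] += 1                          # Step 3: count judges
    return table
```

Each row of Tables 1 to 3 then corresponds to one pair type in the returned table, with one count per outcome.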

Table 1 presents the results obtained from 
homogeneous pairs of statements; Tables 2 


TABLE 3 


NUMBER OF SUBJECTS SHOWING NET EXTREMITY AND 
NEUTRALITY EFFECTS IN RESPONSE TO PAIRS 
SEPARATED BY ONE RATING CATEGORY 


                   n showing:          n with equal
Compo-       Extremity    Neutrality   frequency of     p values for
nents        effect       effect       extremity and    extremity versus
                                       neutrality       neutrality^a
0, 2         b
1, 3         35           8            10               .01
2, 4         27           13           6                .05
3, 5         36           17           1                .05
4, 6         b
5, 7         23           25           21               ns
6, 8         27           4            14               .01
7, 9         33           11           17               .01
8, 10        b


a Two-tailed sign test. 
b See Table 1, Footnote b. 


and 3 present the results from pairs com- 
prised of statements in adjacent categories, 
and statements separated by one rating 
category, respectively. For example, the sec- 
ond row of Table 1 indicates that in re- 
sponding to pairs in which both constituents 
were rated as being in Category 1, 19 sub- 
jects produced more extremity than neutrality 
responses, 14 subjects showed a preponder- 
ance of neutrality responses, and 10 showed 
an equal frequency of extremity and neu- 
trality reactions. The results of this comparison 
thus reveal an overall extremity effect 
(more people show extremity than neutral- 
ity), although the trend is not significant. 
Inspection of Table 1 indicates that in each 
of the eight homogeneous pairs listed, ex- 


FIG. 2. Relative frequency of pair judgments 
which are at least as extreme as the more polarized 
constituent; all constituents are drawn from the 
positive half of the continuum. (Ordinate: probability 
that pair polarization ≥ polarization of more extreme 
element; abscissa: less polarized element.) 


tremity effects were more common than neu- 
trality effects; moreover, in five of these 
cases, the obtained data differed significantly 
from a .50-.50 split. 
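
The p values in Tables 1 to 3 come from a two-tailed sign test against a .50-.50 split. As a rough check, the sketch below computes an exact binomial version of that test; the function name is hypothetical, and it assumes that subjects showing equal frequencies are dropped before testing, which the tables do not state explicitly.

```python
from math import comb


def sign_test_two_tailed(n_plus, n_minus):
    """Exact two-tailed sign test p value under a .50-.50 null (ties already dropped)."""
    n = n_plus + n_minus
    k = max(n_plus, n_minus)
    # Probability of a split at least this lopsided in one direction
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Counts taken from the tables: (extremity, neutrality)
for label, counts in {"1, 1": (19, 14), "1, 3": (35, 8), "2, 4": (27, 13)}.items():
    print(label, round(sign_test_two_tailed(*counts), 4))
```

Under these assumptions the 19 versus 14 split for Category 1 pairs falls well short of significance, whereas the 35 versus 8 split for 1, 3 pairs is significant beyond the .01 level, in line with the tabled entries.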

Overall consideration of Tables 1, 2, and 3 
provides compelling evidence for the extrem- 
ity effect. In 20 of the 22 instances presented, 
extremity effects are more common than neu- 
trality reactions; unfortunately, since the 
same subjects appear in the various compari- 
sons, these observations are not independent 
and hence cannot be evaluated by our usual 
methods of statistical inference. The strength 
of the extremity effect is, however, further 
evidenced by the fact that in 16 of these 22 
nonindependent tests of the hypothesis, a 
significant (p < .05) extremity effect was 
obtained. 


Discussion 


Do these data fit the congruity principle? 
As shown in Tables 1, 2, and 3, the data 
clearly reveal the predicted emphasis on ex- 
treme elements. On the other hand, the fact 
that homogeneous pairs typically receive 
more extreme ratings than their constituents 
(see Table 1) is incompatible with the 
Osgood-Tannenbaum model. According to 
their approach, a pair composed of equally 
polarized elements should receive the same 
rating as its constituents. It should be noted, 
moreover, that the summation model is quite 
compatible with this aspect of the data. 

Do the data, then, fit the summation 
model? One test for summation is to restrict 
our attention to those pairs whose elements 
both come from the same side of the evalua- 
tive continuum; if in fact the rating of a 
compound stimulus can be predicted by sum- 
ming the evaluations of its constituent ele- 
ments, we should find that the pair always 
receives a rating at least as extreme as its 
most polarized constituent. 

To test this prediction, we first classified 
the statement pairs in terms of subjects’ re- 
sponses to the individual constituents. For 
example, we considered within one group all 
those instances in which the constituent ele- 
ments had received ratings of 1 and 3. Note 
that within such a classification we would 
include many different pairs of statements; 
moreover, a pair which was classified in the 
1-3 category for one subject might appear in 
the 1-4 category for another. Having con- 
structed this classification system, we next 
noted the relative frequency with which a 
given type of pair was given an extreme 
rating; that is, a rating as extreme as or more 
extreme than its most polarized constituent. 
The results of this analysis are graphically 
presented in Figures 2 and 3; Figure 2 shows 
the relative frequency of these extreme 
ratings for pairs on the positive side of the 
continuum, while Figure 3 reports results 
from the negative side. Inspection of Figure 2 
shows, for example, that when an element 
which received a 0 rating was paired with 
another 0, the compound stimulus received 
an extreme rating (as defined above) in 82% 
of such instances. When a 0 element was 
paired with a 1, extreme ratings were ob- 
tained in 61% of the cases. 


FIG. 3. Relative frequency of pair judgments 
which are at least as extreme as the more polarized 
constituent; all constituents are drawn from the 
negative half of the continuum. (Ordinate: probability 
that pair polarization ≥ polarization of more extreme 
element; abscissa: less polarized element.) 

Figures 2 and 3 are quite orderly and 
consistent with one another in showing that 
extreme ratings are most likely to occur when 
the constituent elements of a pair are located 
relatively close to one another. Thus, both for 
positive and negative stimuli, the probability 
of obtaining an extreme rating is greater 
than .50 for all homogeneous pairs and for 
all pairs drawn from adjacent scale cate- 
gories. On the other hand, extreme ratings 
occur rather infrequently (p usually < .50) 
when the constituents are separated by one 
or more categories. This last aspect of the 
data is clearly inconsistent with the sum- 
mation model; that is, contrary to summation 
theory, when the contributing elements are 
both drawn from the same side of the evalu- 
ative continuum, if they are quite different 
in intensity, the pair is generally seen as 
less extreme than the more polarized of its 
constituents. 
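
The quantity plotted in Figures 2 and 3 can be illustrated with a short sketch. It assumes that polarization is the distance of a rating from the scale midpoint (5) and that a pair rating counts as "extreme" only if it lies on the same side of the midpoint as the more polarized constituent; both the names and the data layout are hypothetical. Under a strict summation model the resulting proportion would be 1.0 for every same-side pair type.

```python
from collections import defaultdict

NEUTRAL = 5  # midpoint of the 0-10 scale


def polarization(rating):
    """Distance of a rating from the neutral midpoint."""
    return abs(rating - NEUTRAL)


def is_extreme(pair_rating, a, b):
    """True if the pair is rated at least as extreme as its more polarized constituent."""
    anchor = a if polarization(a) >= polarization(b) else b
    same_side = (pair_rating - NEUTRAL) * (anchor - NEUTRAL) >= 0
    return same_side and polarization(pair_rating) >= polarization(anchor)


def extreme_rating_rates(judgments):
    """judgments: iterable of (rating_a, rating_b, pair_rating); returns rate per pair type."""
    hits, totals = defaultdict(int), defaultdict(int)
    for a, b, pair_rating in judgments:
        pair_type = (min(a, b), max(a, b))
        totals[pair_type] += 1
        hits[pair_type] += is_extreme(pair_rating, a, b)
    return {t: hits[t] / totals[t] for t in totals}
```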

There is yet another sense in which the 
present data conflict with both summation 
theory and the congruity principle. Both 
theories suggest that the rating of a pair 
composed of one polarized and one neutral 


TABLE 4 


VOCABULARY DEFINITIONS USED AS STIMULI IN 
EXPERIMENT II 


1. DONKEY: a type of four legged animal.
(M = 1.91, SD = [illegible])

2. BACON: product of a pig. (M = 1.86, SD = 1.54)

3. BRIM: pertaining to a hat; something that
protrudes. (M = 2.91, SD = 1.48)

4. MUZZLE: piece that covers the mouth if they bite;
[illegible] individual from being bitten.
(M = 3.14, SD = [illegible])

5. NUISANCE: something being d[illegible]
should be corrected. (M = 5.09, SD = [illegible])

6. ENVELOPE: something you put a letter
i[illegible]body can read it. (M = 5.04, SD = 2.63)

7. GAMBLE: [illegible] all life's a gamble.
(M = 7.00, SD = [illegible])

8. AMANUENSIS: a Latin term for th[illegible]
(M = 6.95, SD = 2.98)

9. PEWTER: could be some sort of a [illegible].
(M = 8.44, SD = 1.77)

10. FABLE: the right thing when somebody else is
equally right. (M = 7.91, SD = 2.62)

element should be the same as the rating of 
the polarized component. Inspection of Fig- 
ures 2 and 3, however, indicates that when 
neutral elements (at Position 5) are included 
in a pair, the evaluation of the compound 
is usually less extreme than the evaluation of 
the polarized coelement. (The probability of 
extreme ratings is less than .50 in 8 of the 
10 relevant instances.) 


EXPERIMENT II 


The results of Experiment I supported the 
conclusion that judgments of complex social 
stimuli are typically more extreme than might 
be anticipated from a simple averaging of 
the constituent elements. The primary pur- 
pose of Experiment II was to determine if 
this principle was applicable to other judg- 
mental domains; in particular, this experi- 
ment was concerned with judgments in the 
area of psychopathology. Stated briefly, we 
wished to see if clinical judgments based on 
multiple inputs were more markedly affected 
by indicators of severe pathology, than they 
were by information indicative of less severe 
pathology. 

Note that the concept of "extremity" in 
this situation is unidirectional; that is, an 
input indicating severe pathology is more ex- 
treme than one indicating less pathology, re- 
gardless of the absolute pathology indicated 
by each. In contrast, the concept of extremity 
is bidirectional for social inputs; if two state- 
ments are favorable, the more favorable is 
the more extreme, while if two statements are 
unfavorable, the less favorable is the more 
extreme. 


Method 


Ten schizophrenic vocabulary responses were 
selected from a study conducted by Arnhoff (1953). 
He had 22 experienced clinical psychologists rate 
222 schizophrenic vocabulary responses in terms of 
the “severity of the disorder of the thinking process” 
on a scale from 1 (minimal disorder) to 11 (maxi- 
mal disorder). Of the 10 responses chosen for this 
study, 2 had a mean rating of 2 ± .15, 2 had a mean 
rating of 3 ± .15, 2 had a mean rating of 5 ± .15, 
2 had a mean rating of 7 ± .15, and 2 had a mean 
rating of 8 ± .15. Within each of these specified 
intervals, the response with the largest variance of 
ratings was selected and the response with the 
smallest variance of ratings was selected. (One 
departure from this ideal criterion was necessary in 
order to avoid using the same vocabulary word 
twice.) The 10 responses used in this study are 
presented in Table 4 together with their mean 
ratings and the standard deviation of these ratings. 

A two-part questionnaire was constructed. Instruc- 
tions to the first part read as follows: 


Following are ten vocabulary responses of pa- 
tients unanimously diagnosed as schizophrenic by 
experienced clinicians; these responses were ob- 
tained from five types of patients. 

Type A: ONLY SLIGHT EVIDENCE OF 
THINKING DISORDER; ambulatory; copes well 
with social reality outside hospital. 

B: MILD THINKING DISORDER; ambula- 
tory, copes with social reality outside hospital in 
minimal fashion. 

C: MODERATE THINKING DISORDER; 
hospitalized; occasionally unable to cope with 
social reality outside hospital. 

D: SEVERE THINKING DISORDER; hos- 
pitalized, often unable to cope with social reality 
outside hospital. 

E: SEVERELY REGRESSED THINKING; 
back ward hospitalization; unable to cope with 
social reality outside hospital. 

Two of the ten responses were given by each 
type of patient. After examining all ten responses, 
please try to judge which two came from each 
type of patient; where evidence is inconclusive, 
please make an educated guess. 


The responses were then listed in alphabetical 
order. 

The instructions to the second part of the ques- 
tionnaire read: 


Actually, some of these responses were made by 
the same patient. 

Following are pairs of responses; some of these 
pairs were obtained from the same patient and 
some were not. Please treat all pairs, however, 
as if they were obtained from the same patient 
and judge which type of patient would be most 
likely to give each pair. 

Do not refer back to your earlier judgments. 

For the convenience of the subject, the five 
types of patients were again listed; then, every 
pair of vocabulary responses was presented (number 
of pairs = (10 × 9)/2 = 45). Of the 45 pairs presented, 
5 were homogeneous (the two definitions had pre- 
viously been judged as belonging in the same cate- 
gory) and 40 were heterogeneous. Given five 
response categories, there are (5 × 4)/2, or 10 types of 
heterogeneous pairs (AB, AC, BC, BD, etc.); each 
of these 10 types was represented by four different 
pairs. 
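
The pair counts follow directly from the design of two definitions per category. The sketch below enumerates them; the labels "A1", "A2", and so on are hypothetical stand-ins for the ten definitions.

```python
from collections import Counter
from itertools import combinations

categories = "ABCDE"
# Hypothetical labels: two definitions per patient type, e.g. "A1" and "A2"
definitions = [c + str(i) for c in categories for i in (1, 2)]

pairs = list(combinations(definitions, 2))
# A pair's type is given by the categories of its two members, e.g. "AB"
pair_types = Counter("".join(sorted((x[0], y[0]))) for x, y in pairs)

homogeneous = {t: n for t, n in pair_types.items() if t[0] == t[1]}
heterogeneous = {t: n for t, n in pair_types.items() if t[0] != t[1]}

print(len(pairs))                                       # 45 pairs in all
print(sum(homogeneous.values()))                        # 5 homogeneous pairs
print(len(heterogeneous), set(heterogeneous.values()))  # 10 types, 4 pairs each
```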

The questionnaire was mailed to 50 graduate 
students in clinical psychology who had completed 
a course in advanced diagnostic testing; these gradu- 
ate students came from the University of Michigan, 
Michigan State University, and Wayne State Uni- 
versity. In addition, the questionnaire was mailed 
to 20 practicing clinical psychologists from the same 
three institutions. 


FIG. 4. Distribution of difference scores (number 
of extremist minus number of neutralist responses). 


Results 


Thirty-five graduate students and nine 
clinical psychologists returned the question- 
naire. Since the response patterns of these 
two groups did not differ in any important 
respects, they were analyzed together. 

In analyzing these data one might wish to 
employ the methodology developed for the 
ordinal analysis reported in Experiment I. 
That is, we might restrict our attention to 
those instances in which a pair was placed in 
a category either more extreme or more neu- 
tral than the scale values of its contributing 
elements (an example of this would be the 
placement of a BC pair in Category A, or 
in Categories D or E). This approach 
proved impractical, however, since the over- 
whelming majority of the heterogeneous 
pairs were placed in one of the two categories 
from which the elements were drawn, or in 
some intermediate category. (As in Experi- 
ment I, the frequency of these intermediate 
judgments argues against a summation 
approach to the present data.) 

Our earlier approach was therefore modi- 
fied in the following manner. Consider a 
heterogeneous but nonadjacent pair of defini- 
tions, jk, in which k is the more extreme 
element.² It is reasonable to assume that the 
average value for such a nonadjacent pair 
typically would lie neither in Category j nor 
in Category k, but instead would be located 
within one of the categories lying between 
j and k. Given this assumption we can test 
for an extremity effect by comparing the 
number of k responses or responses more 
extreme than k, with the number of j responses 
or responses less extreme than j. The former type 
of judgment may for present purposes be 
classified as extreme, the latter type, neutral. 


2 Note that these pairs are defined by the responses 
of the individual subject; that is, a pair is defined 
as a jk pair for a given subject if he placed one of 
the constituent definitions in Category j and one in 
Category k. 

The number of extreme minus the number 
of neutral judgments for all heterogeneous, 
nonadjacent pairs was therefore computed 
separately for each subject. The distribution 
of these differences is presented in Figure 4. 
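
The scoring rule for heterogeneous, nonadjacent pairs and the per-subject difference score can be sketched as follows. The function names and the tuple layout are hypothetical; the rule itself follows the description above, with Category A treated as least and Category E as most pathological.

```python
ORDER = "ABCDE"  # A = least severe thinking disorder, E = most severe


def classify_pair_response(j, k, response):
    """j, k: categories the subject gave the two definitions; response: category given the pair."""
    lo, hi = sorted((ORDER.index(j), ORDER.index(k)))
    if hi - lo < 2:
        raise ValueError("only heterogeneous, nonadjacent pairs are scored")
    r = ORDER.index(response)
    if r >= hi:
        return "extreme"        # at or beyond the more pathological constituent
    if r <= lo:
        return "neutral"        # at or below the less pathological constituent
    return "intermediate"       # strictly between the two constituents


def difference_score(judgments):
    """Number of extreme minus number of neutral judgments for one subject."""
    labels = [classify_pair_response(j, k, r) for j, k, r in judgments]
    return labels.count("extreme") - labels.count("neutral")
```

A positive difference score marks a subject whose pair judgments lean toward the more pathological constituent; Figure 4 is the distribution of these scores over subjects.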

Extreme and neutral judgments are 
equally likely by chance alone. However, the 
Wilcoxon matched-pairs signed-ranks test 
(Siegel, 1956, p. 75) suggests that extreme 
judgments are more common (z = 1.84, p 
< .05, one-tailed test). A nonparametric test 
was employed here since the sample included 
one extremely aberrant subject who made 
14 more neutral judgments than extreme 
judgments. If this subject is excluded, a 
parametric t test indicates with much higher 
reliability that the mean difference score is 
positive; that is, extreme judgments are more 
common (t = 2.24, p < .025, one-tailed test) 
than neutral judgments when we consider 
heterogeneous pairs. On the other hand, 
homogeneous pairs drawn from Categories B, 
C, and D do not yield this effect (homogene- 
ous pairs from Categories A and E do not 
provide suitable test situations for the ex- 
tremity effect because of end-effect restric- 
tions). For Stimulus Pairs BB, CC, and DD, 
judgments which are more extreme than the 
average input actually occur somewhat less 
frequently than judgments which are less 
extreme (though not significantly so); for 
example, BB pairs are placed in Category A 
25% of the time, and only 23% of the time 
in Category C. 
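
For readers unfamiliar with the Wilcoxon matched-pairs signed-ranks test applied to the difference scores above, the following is a minimal sketch of its large-sample z approximation. It is an assumption-laden illustration rather than the authors' computation: zero differences are dropped, tied absolute values receive average ranks, and no tie correction is applied to the variance.

```python
from math import sqrt


def wilcoxon_z(differences):
    """Normal-approximation z for the Wilcoxon signed-ranks test on difference scores."""
    d = [x for x in differences if x != 0]      # zero differences are dropped
    n = len(d)
    ordered = sorted(d, key=abs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1          # 1-based average rank for a tie group
        for m in range(i, j + 1):
            ranks[m] = average_rank
        i = j + 1
    w_plus = sum(r for r, x in zip(ranks, ordered) if x > 0)
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean) / sd
```

Because the test works on ranks, a single very large negative difference (such as the subject with 14 more neutral than extreme judgments) can contribute no more than the largest rank, which is why it is less affected by such a case than a t test.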


Discussion 


The results from this experiment are quite 
consistent with those of Experiment I, when 
we restrict our attention to heterogeneous 
pairs; in both studies, subjects apparently 
weighted input elements in accordance with 
their extremity. On the other hand, an ex- 
tremity effect was also obtained for homo- 
geneous pairs in Experiment I, but not in 
Experiment II. It is not clear at this point 
why the homogeneous items should yield dif- 
ferent results as we move from the social 
to the clinical domain; on the other hand, 
methodological differences (e.g., number of 
response categories was changed from 11 
to 5) may be important here. 

Because of the forced-choice methodology 
employed in the first part of Experiment II, 
there is a possible artifact which should be 
considered. Suppose the subjects perceived all 
of the vocabulary responses as coming from 
patients of the more extreme (i.e., pathologi- 
cal) types. When asked to place two defini- 
tions in each category, the subjects would 
be forced to include some definitions in the 
less extreme categories that they subjectively 
felt should really belong in a more extreme 
category. The judgment of the pairs would 
then show what looks like an extremity 
effect, simply because the forced-choice 
method had produced a systematic under- 
estimation of the pathology associated with 
the various definitions; judgments of the 
pairs, on the other hand, were not obtained 
within the forced-choice format, and hence 
even if the judges followed a strict averaging 
approach, their responses might appear to 
be displaced toward the pathological end of 
the scale. 

We should hasten to add that there is no 
empirical evidence to support this account. 
Moreover, the absence of an extremity ef- 
fect on the homogeneous pairs is inconsistent 
with the proposed line of reasoning. Given 
the similarity between the present results and 
those of the other experiments in this series, 
we are thus inclined to interpret Experiment 
II as supporting the hypothesis of an extrem- 
ity effect in complex clinical judgments when 
the input elements are heterogeneous as 
indicators of pathology. It should also be 
noted that given the relative infrequency of 
responses more extreme than the constituents, 
these data seem consonant with some variant 
of the averaging approach, rather than with 
a summation model. 


EXPERIMENT III 


While Experiments I and II strongly sup- 
ported the presence of an extremity effect, 
there was some question as to the generality 
of the phenomenon, for both of these experi- 
ments employed a categorical method of data 
collection. That is, subjects were instructed 
to make their judgments by placing each 
stimulus in its appropriate category along 
a dimension selected by the experimenter. It 
therefore seemed worthwhile to see if similar 
results might be obtained in a setting where 
the experimenter did not call attention to 
any one of the many dimensions which 
characterized the stimuli. 

To accomplish this aim, a third experiment 
was designed using a noncategorical method. 
Subjects were presented with sets of three 
opinion statements, each set of three presum- 
ably reflecting the views of some unknown 
student. Subjects rank ordered these items 
in terms of their effectiveness in representing 
the other's “true” attitude. In scoring the 
obtained data, we started with the assump- 
tion that the judge would implicitly com- 
pare each item with his overall understanding 
of the other's position, and would select the 
item that seemed closest to this hypothetical 
point as being most representative. If ex- 
treme elements (in terms of evaluation) are 
given particular emphasis, the resultant 
should typically be located at a point rela- 
tively close to the most extreme element and 
somewhat farther from the least extreme ele- 
ment; an extremity effect was therefore in- 
ferred whenever the most extreme element 
of the three was selected as being more repre- 
sentative of the other's views than the least 
extreme element. Note that this method 
does not consider the middle element. For 
example, even in these cases where the middle 

item was regarded as most representative of 
all (closest to other's true position), we 
might still infer an extremity effect if the 
most extreme element was consistently judged 
to be more representative than the least ex- 
treme element, since this would suggest that 


3 While it might have been possible to employ 
this technique by presenting subjects with two, 
rather than three opinion statements, it was felt 
that the use of two statements might meet with 
resistance. If subjects were simply "averaging" the 
inputs, they might rationally select one statement 
out of three as being closer than the others to this 
hypothetical average point; given two statements, 
on the other hand, the choice would presumably be 
equidistant from their average. 


TABLE 5 
CONSTITUENTS FOR THE SIX CLASSES OF TRIPLES 


Triple class    Constituent elements 
I               ++, ++, ++
IIa             ++, ++, 0
IIb             ++, 0, 0
III             +, +, +
IV              [illegible]
V               -, -, -
VI              [illegible]


Note. ++ = strongly pro-fraternity, + = moderately 
pro-fraternity, -- = strongly anti-fraternity, - = moderately 
anti-fraternity, 0 = neutral. 


the endorser was perceived as holding views 
which were located somewhere between the 
middle and extreme items. 

As in Experiment I, the respondents’ atti- 
tudes again served as an independent vari- 
able; while our earlier data did not yield 
any evidence of attitude effects, it seemed 
worthwhile to explore this issue once more. 
Consequently data were collected from 
three groups of subjects who varied in their 
attitude towards college fraternities. 


Method 


Attitude assessment. Experiment III was con- 
ducted in two sessions. In the first session, 60 
college students (who were paid for participating) 
rated the concept college fraternities plus several 
filler concepts, using six evaluative scales⁴; each 
scale included nine rating categories (scored 0-8). 
Attitudes toward fraternities were assessed by sum- 
ming across scales, following which three groups 
of subjects (pro, neutral, and anti fraternity) were 
selected. There were 20 subjects in each group, and 
their mean attitudes, as assessed on a scale ranging 
from 0 to 48, were 13.3, 26.1, and 38.3, for the 
pro’s, neutrals, and anti’s, respectively. 

Test booklets. In the second experimental session, 
subjects were presented with 70 opinion triples⁵; 
each triple presumably reflected the statements en- 
dorsed by a University of Michigan student. The 
subjects were further told that after making three 
choices, each of the endorsing students had picked 
the one statement that best conveyed his views, 
and the one statement that was least adequate for 
this purpose. With this as a background, subjects 
were asked to pick out the two statements selected 
by the “other”; that is, the statement which the 
endorsing student regarded as most effectively repre- 
senting his attitude, and the statement which he 
had chosen as being least effective in this regard. 


4 The same bipolar adjectives were used here as in 
Experiment I. 

5 About half of these triples were included for 
exploratory purposes; as detailed in Table 5, only 
36 of the triples were used in the final data analysis. 


TABLE 6 
ANALYSIS OF VARIANCE FOR EXPERIMENT III 


Source                       df     MS            F 
Attitude (A)                  2     .33           <1.00 
Test booklet (B)              1     2.62          9.70** 
A × B                         2     .72           2.67* 
M vs. 1.57                    1     2.98          11.04** 
Subjects within group (C)    54     .27 
Triple class (D)              5     .27           1.08 
A × D                        10     .18           <1.00 
B × D                         5     1.02          4.08** 
A × B × D                    10     [illegible]   <1.00 
D × C                       270     .25 

** p < .005. 

Classes of triples. Since it seemed possible that 
the effects we obtained might partly depend upon 
the evaluative content of the stimulus material, the 
opinion triples were categorized into six classes, 
defined by the scale segments from which their 
constituents were drawn. Table 5 presents a sum- 
mary description of the six item classes; there were 
six triples within each class. For example, among 
the triples in Class I, the constituent elements all 
represented strongly pro-fraternity sentiments. Class 
II triples were of two subtypes; half of these 
triples (IIa) included two strongly pro-fraternity 
items and one neutral, while the other half (IIb) 
had only one strongly favorable item and two 
neutrals. 

It may readily be seen that within Classes I, II, 
and III the triples are generally pro-fraternity, 
with Class I being most extreme; triples within 
Classes IV, V, and VI fall on the negative side of 
the scale, with Class VI being most extreme. Note 
that Classes II and III are about equal in average 
favorability, as are Classes IV and V. Results from 
these classes were, however, considered separately, 
since it seemed possible that the data might partly 
reflect differences in the heterogeneity of items 
within each triple; that is, the triples within Classes 
II and IV, which include both extreme and neutral 
items, reflect a greater degree of heterogeneity than 
Classes III and V, each of which includes items 
from only one segment of the attitude continuum 
(moderately pro- and moderately anti-fraternity, 
respectively). 

Scoring the test booklets. The subject’s selections 
for the most and least representative items within 
each triple enabled us to infer his rank ordering 
for the three elements. Each triple was then scored 
by noting whether the most extreme of the con- 
stituent items was regarded as more representative 
of the "other's" views than the least extreme item.⁶ 
Using this system, a separate count was conducted 
to represent the subject's performance on each of 
the six classes of triples. 


6 The rank order of the items was determined by 
their mean favorability (across all subjects) in 
Experiment I. 
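
The scoring rule can be written compactly once each triple is reduced to the subject's "most representative" and "least representative" picks. The sketch below uses hypothetical item labels and function names; it relies only on the fact that, with three items, the most extreme item outranks the least extreme one exactly when it was picked as most representative or when the least extreme item was picked as least representative.

```python
def shows_extremity(most_rep, least_rep, most_extreme, least_extreme):
    """True if the most extreme item was ranked as more representative than the least extreme item."""
    if most_rep == least_rep:
        raise ValueError("the two selections must be different items")
    return most_rep == most_extreme or least_rep == least_extreme


def class_score(choices):
    """choices: list of (most_rep, least_rep, most_extreme, least_extreme) for one class of triples."""
    hits = sum(shows_extremity(*c) for c in choices)
    return hits / len(choices)
```

With six triples per class, `class_score` yields the relative frequency analyzed for each subject and class; a value of .50 is what simple averaging would be expected to produce.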

Controls for order effects and item content. 
Within each class, the order in which the elements 
were presented varied from one triple to the next. 
For example, among the six triples in Class I, the 
most favorable item appeared twice in Position 1, 
twice in Position 2, and twice in Position 3. 

To reduce the possibility that the results obtained 
within any one class of triples might be attributable 
to the particular items and combinations of items 
that were selected, two parallel test booklets (A 
and B) were constructed. The two booklets differed 
solely in terms of the specific triples that were 
included as representatives of the various classes. 
Half the subjects in each attitude group were pre- 
sented with Booklet A and half with Booklet B. 
The triples within each booklet were arranged in 
a fixed random order, with the restriction that no 
more than two triples from a given class appeared 
on the same page (there were four triples on each 
page). 

Statistical analysis. The design of the experiment 
involved three independent factors (the subject's 
attitude, the test booklet he received, and the six 
categories of triples); one of these factors (triple 
classification) represented a within-subjects variable. 
The data which were analyzed thus included six 
scores for each subject; each score represented the 
percentage (relative frequency out of the six triples 
in a given class) with which the subject chose the 
most extreme item in a triple as being more repre- 
sentative of the other's views than the least extreme 
item. If the subjects followed a simple averaging 
approach, we would expect them to choose the most 
extreme item over the least extreme item in 50% of 
the triples. An arc-sine transformation was applied 
to the obtained data before conducting the actual 
analysis. 
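
The exact form of the arc-sine transformation is not given here. The sketch below assumes the common form 2·arcsin(√p), in radians, which is consistent with the null-hypothesis value of 1.57 cited in the Results, since a proportion of .50 maps to π/2.

```python
from math import asin, sqrt


def arcsine_transform(p):
    """Assumed form of the arc-sine transformation: 2 * arcsin(sqrt(p)), in radians."""
    return 2 * asin(sqrt(p))


# With six triples per class, a subject's proportion can only be k/6
for k in range(7):
    print(k, round(arcsine_transform(k / 6), 2))   # k = 3 (p = .50) gives 1.57
```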


Results 


The analysis of variance is summarized in 
Table 6. In addition to partitioning the vari- 
ance attributable to the different experi- 
mental treatments, following a procedure 
adapted from Collier (1958), Table 6 also 
tests the difference between the grand mean 
of the obtained data (M) and the mean of 
1.57 which would be anticipated under the 
null hypothesis (i.e., if the extremity effect 
were to occur with a probability of .50, as a 
purely chance phenomenon). 

The most important finding here is the 
fact that the grand mean of the transformed 
data significantly exceeds the transformed 
score associated with the null hypothesis 
(p < .005); that is, the extremity effect oc- 
curs too frequently for us to attribute it to 
random variation in our subjects’ responses.⁷ 
The other significant effects are of limited 
interest, since they are all dependent upon 
variation between the two test booklets, and 
thus mainly reflect the experimenters’ in- 
ability to assemble alternate test forms that 
were truly equivalent. When these secondary 
results are studied more closely, it is obvious 
that they derive from the fact that although 
the extremity effect appears with considerable 
clarity when we consider the total set of 
pooled data, there are occasional inconsisten- 
cies within the individual test booklets. For 
example, when we inspect the data from 
Booklet A separately, we find that at a 
descriptive level, the extremity effect is ob- 
tained in only three of the six classes of 
triples; when we consider data from the 
two booklets combined, however, the results 
are in accord with the extremity effect for 
each of the six classes. 

Finally, it should be noted that the attitude 
variable did not account for a significant 
proportion of the variance in this experi- 
ment, nor did its associated interaction terms. 
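
The chi-square value reported in Footnote 7 (39 versus 21 subjects, tested against a 30-30 split) can be checked by hand. The sketch below is an assumption about how the figure was obtained: the reported 4.82 is reproduced when a continuity (Yates) correction is applied, whereas the uncorrected statistic is 5.40. Either value exceeds 3.84, the .05 critical value for one degree of freedom.

```python
def chi_square_gof(observed, expected, yates=False):
    """One-way chi-square statistic, optionally with the Yates continuity correction."""
    correction = 0.5 if yates else 0.0
    return sum((abs(o - e) - correction) ** 2 / e for o, e in zip(observed, expected))


observed = [39, 21]   # subjects with and without an overall extremity effect
expected = [30, 30]   # null hypothesis of an even split

print(round(chi_square_gof(observed, expected, yates=True), 2))   # 4.82
print(round(chi_square_gof(observed, expected), 2))               # 5.4
```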


Discussion 


Taken as a whole, these results suggest 
that the extremity effect is indeed a replicable 
phenomenon, even when the evaluation dimen- 
sion is not highlighted by the experimenter’s 
instructions. It is apparent, however, that 
within each class of triples there is a sizable 
variation in response as we move from one 
sample of items to the next. As a result, when 
we attempt to draw conclusions about a given 
class from a limited number of instances (as 
we do when we focus on the data from a 
single test booklet), we may meet with ir- 
regularities. As we expand our sample of 
triples within each category, on the other 
hand, these irregularities tend to disappear, 
as shown by the results for our pooled data. 


7 A further analysis of these data indicated that 
for 39 of our 60 subjects, the average transformed 
score (across the six classes of triples) exceeded the 
score associated with the null hypothesis; this 
means that 39 of our subjects may be categorized as 
showing an overall extremity effect, while 21 do not. 
These results yield a chi-square value of 4.82 (p < 
.05), when compared with the null hypothesis of a 
30-30 split. 

It is interesting to consider some of the 
factors which contribute to inconsistency of 
response within a given category of triples. 
A major problem is the fact that in com- 
paring the subject’s response to the most and 
least favorable items in each set, we did not 
consider the individual's interpretation of 
these items, but instead relied on normative 
group data. As a result, there were doubtless 
some cases in which the subject’s reaction to 
a given triple was misclassified because of 
our failure to consider his particular ranking 
of the items along the favorability dimension. 
Another problem results from the spacing of 
the items within each triple. Anderson and 
Jacobson (1965) have shown that in forming 
impressions of people based on sets of three 
trait adjectives, subjects generally discount 
the adjective that is inconsistent with the 
other two, unless given special instructions. 
For example, given two positive traits and 
one negative, people generally place less 
weight on the negative trait in arriving at 
an overall rating for likability. While the
present study did not include any diametri- 
cally opposed items within the same triple, 
it is quite possible that similar processes were 
operative, in which case the subject's response 
to some of the triples may have been dic- 
tated by the relative inconsistencies of the 
items, rather than by their extremeness. 

A word is perhaps in order concerning the 
absence of any attitude effects in Experi- 
ments I and III. In reviewing previous 
studies in this area, we had developed the 
tentative hypothesis that judgments might 
be more clearly affected by attitude when the 
subjects were presented with rather extended 
persuasive messages, rather than brief single 
statements; this conjecture was somewhat 
strengthened by the recognition that the 
more complex messages might possibly be 
more ambiguous than their simpler counter- 
parts. Experiments I and III thus repre- 
sented an attempt to test this hypothesis by 
systematically increasing the amount and 
complexity of the stimulus inputs that were 
provided to the judges. While the results 
failed to confirm our prediction, it is inter- 
esting to note that previous studies conducted 
with extended messages (Manis, 1960, 1961) 
have yielded clear evidence of attitude-related
distortion using the same topic (fraternities)
and subject population (college
students) as were employed in Experiments 
I and III. While it is hardly conclusive, our 
failure to replicate these earlier results with 
either the item pairs or triples may mean 
that the present studies did not offer a suffi- 
cient range of stimulus complexity to test our 
hypothesis in an adequate fashion; perhaps 
even the triples of Experiment III presented 
the judges with a far simpler task than that 
afforded by an extended persuasive message. 
On the other hand, previous investigators 
have often reported significant attitude ef- 
fects in studies using simple stimulus ma- 
terials, much like the single statements that 
we employed (Sherif & Hovland, 1961). The 
discrepancy here is not readily interpretable. 


A Model for Evaluating Compound Stimuli 


The present series of studies is quite con- 
sistent in showing that compound judgments 
are typically more extreme than the average 
of their constituents. While the Osgood-Tan- 
nenbaum congruity approach and Fishbein’s 
summation model both conform to certain 
aspects of the data, these theories seem defi- 
cient in several important respects. One 
problem shared by both models is the assumption
that neutral elements do not affect the
evaluation of the compounds in which they
appear. Results obtained in Experiment I
clearly contradict this assumption, in show- 
ing that compounds which include neutral 
elements tend to receive less extreme ratings 
than those associated with their individual 
polarized constituents. 

The congruity principle has still another 
weakness in failing to predict that compound 
stimuli will, under certain circumstances, re- 
ceive more extreme ratings than any of their 
constituents; as noted in Experiment I, this 
is most likely to occur when the elements 
are homogeneous, or at least close together, 
with respect to the evaluative dimension. 
While the summation model is admirably 
suited to this aspect of the data, it is defi- 
cient in predicting that compounds will 
always receive more extreme ratings than 
their constituents, providing only that the 
constituent elements are all of the same sign. 

In general, an appropriately modified averaging
approach seems most suitable for the
present data, since in both Experiments I 
and II, the pairs were quite frequently per- 
ceived at a point somewhere between the 
constituent elements. A simple version of
this approach might take the form (read $\approx$
as “may be approximated by”):


$$E_c \approx \frac{\sum_{i=1}^{n} w_i S_i}{\sum_{i=1}^{n} w_i}$$


where:


$E_c$ = the evaluation of the compound
$w_i$ = the weight associated with the ith element
$S_i$ = the scale value of the ith element.


While all elements are given some weight
greater than 0, in order to account for the
extremity effect we would have to assume a
direct relationship between the polarity of
each element and the weight which it receives.
It should be noted that this equation
leads to the prediction that a complex stimulus
whose elements are heterogeneous in
value will receive a more extreme evaluation
than a homogeneous compound whose elements
have an identical (unweighted) average.
While the present studies did not
yield any clear evidence in support of this
prediction, such an effect has been reported
by Podell and Podell (1963).
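
The prediction about heterogeneous and homogeneous compounds can be illustrated with a minimal sketch. It assumes a bipolar scale whose neutral point is 0 and a hypothetical weighting rule, w = 1 + |S|, chosen only because it gives every element a weight above 0 and lets the weight rise with polarity. Neither the rule nor the numbers come from the paper.

```python
def polarity_weight(scale_value):
    # Hypothetical rule: weight is always above 0 and grows with the
    # element's distance from the neutral point (0).
    return 1.0 + abs(scale_value)


def weighted_average(scale_values, weight_fn=polarity_weight):
    """Weighted average of the scale values: sum(w_i * S_i) / sum(w_i)."""
    weights = [weight_fn(s) for s in scale_values]
    return sum(w * s for w, s in zip(weights, scale_values)) / sum(weights)


homogeneous = [2.0, 2.0]    # unweighted mean 2.0
heterogeneous = [1.0, 3.0]  # same unweighted mean, unequal elements
with_neutral = [0.0, 3.0]   # one neutral and one polarized element

print(weighted_average(homogeneous))
print(weighted_average(heterogeneous))
print(weighted_average(with_neutral))
```

Under these assumptions the heterogeneous pair is evaluated as more extreme than the homogeneous pair with the same unweighted mean, and the pair containing a neutral element falls below its polarized constituent. Because the result is an average, it can never exceed the most extreme element.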

The importance of extreme elements in the 
present experiments may be due to the fact 
that the average subject has considerable
subjective confidence in his ability to infer
the significance of these items. A preliminary
study using a different set of items indicated
that there was a very systematic relationship
between the extremity of an opinion item and
the subject's confidence that he had evaluated
the item correctly. This leads to the
plausible hypothesis that in judging a compound
stimulus, the average subject will be
most strikingly affected by those elements
that appear to be unambiguous in their significance.

Although the weighted average approach 
fits the present data reasonably well, it has 
several deficiencies (along with other averaging
approaches) which result from a failure
to consider “number of elements” per se as a
determinant of judgment. 

1. As noted above, results from Experiment 
I replicate earlier findings in showing that 
homogeneous pairs typically receive judg- 
ments which are more polarized than their 
individual constituents. A related phenome- 
non is the finding that evaluative judgments 
of compound stimuli generally become more 
extreme as we increase the number of “like- 
signed” elements (Anderson, 1965; Fishbein 
& Hunter, 1964a, 1964b; Willis, 1960). 
These observations testify to the importance 
of “numerosity” in determining complex social
judgments.

2. Granting that, other things being equal, 
the judge’s response becomes more polarized 
as we increase the number of elements that he
must consider, it seems reasonable to assume
that as we continue to increase the
number of elements, the magnitude of this
effect should diminish. For example, while
two elements may yield a more extreme
response than one, it is unlikely that there
will be much change in response as we move
from 11 elements to 12. Ultimately, as the
number of contributing elements is further 
increased, we anticipate that we will approach 
the limiting case in which the inclusion of 
additional elements has no effect. 

In considering various mathematical tech- 
niques which might modify the weighted 
average in a manner suitable to these con- 
siderations, an attractive possibility was the 
simple device of multiplying our weighted 
average by a function which varied loga- 
rithmically with the number of contributing 
elements. Thus, our earlier equation may be 
modified as follows: 


$$E_c \approx \log n \cdot \frac{\sum_{i=1}^{n} w_i S_i}{\sum_{i=1}^{n} w_i}$$


This formulation has several important 
properties: 

1. Assuming that through the choice of an 
appropriate base our logarithmic multiplier is 
always set at a value greater than 1, we are 
led to predict that the evaluation of a compound
will always be more extreme than the
simple weighted average of its elements; in 
the case of a homogeneous compound, this is 
equivalent to predicting that the compound 
will be more extreme than any of its constitu- 
ents. 

2. The overall evaluation of a set of ele- 
ments is systematically increased in polarity 
as n increases. 

3. As n continues to increase, the weighted
average is multiplied by a negatively acceler- 
ating logarithmic function. Thus, assuming 
a constant weighted average, as n increases, 
there is a steady decrease in the difference 
between the predicted values for compounds 
based on n and n + 1 elements. 

To handle the problem of “compounds” 
consisting of just one element, a further 
modification may be introduced; this involves 
the multiplication of our weighted average 
by log (n + 1), rather than by log n. The log 
of 1 is 0 and thus if log n were used as our
multiplier, this would result in the obviously 
erroneous prediction that all one-element 
compounds receive ratings of 0. The modified 
formula, log (n + 1), eliminates this minor
difficulty. 
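
The properties listed above can be checked numerically with a short sketch of the log (n + 1) form. The base of 1.5 is a hypothetical choice, made only so that the multiplier for a single element, log 2, exceeds 1. Equal weights and a constant scale value of 1.0 are likewise illustrative assumptions, not values from the paper.

```python
import math


def weighted_average(scale_values, weights):
    """Weighted average of the scale values: sum(w_i * S_i) / sum(w_i)."""
    return sum(w * s for w, s in zip(weights, scale_values)) / sum(weights)


def evaluate_compound(scale_values, weights, base=1.5):
    """Weighted average multiplied by log(n + 1) in the given base."""
    multiplier = math.log(len(scale_values) + 1, base)
    return multiplier * weighted_average(scale_values, weights)


# Hold the weighted average constant at 1.0 and vary n from 1 to 12.
predictions = [evaluate_compound([1.0] * n, [1.0] * n) for n in range(1, 13)]
increments = [later - earlier
              for earlier, later in zip(predictions, predictions[1:])]

for n, value in enumerate(predictions, start=1):
    print(n, round(value, 3))
```

With these settings the predicted evaluation rises with every added element, but each successive increment is smaller than the last, so the step from 11 to 12 elements is far smaller than the step from 1 to 2. Using log n instead would give a multiplier of 0 for a one-element compound, which is the difficulty the n + 1 form avoids.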

In conclusion, we should hasten to add 
that the present formulation does not consti- 
tute an appropriate model for all situations 
in which complex stimuli are evaluated. Hicks 
and Campbell (1965), for example, have 
found clear evidence of additivity when 
judges evaluated stimuli from three different 
domains (the subjective value of birthday 
gifts, the seriousness of traffic violations, and 
the degree of bizarreness implied by various 
behavior symptoms). In each case, compound 
stimuli were given more extreme ratings than 
their components. It is, of course, hardly 
surprising to find that the value of two gifts 
considered together exceeds the value of 
either one considered alone; similarly, the 
obtained additivity of traffic violations is not 
unexpected. There is, however, a striking 
contrast between the Hicks and Campbell 
results in the domain of psychopathology 
and those obtained with this same topic in 
Experiment II of the present series. While 
the basis for these divergent results is not 
readily apparent, it may be significant to 
note that the Hicks and Campbell data were 
obtained from undergraduate students, while
our Experiment II used clinical graduate 
students and practicing clinical psychologists 
as judges. It should also be noted that Hicks 
and Campbell’s subjects rated stimuli in all 
three of the domains listed above, and may 
thus have “carried over” an additive combinatorial
process from one set of stimuli to
the next. 

There are other situations in which it is 
clear that the weighted average formulation 
is inapplicable. Consider, for example, the 
combinatorial model which the draft board 
uses in evaluating potential inductees. Here, 
after each individual has been evaluated in 
several areas, an overall judgment (accept or 
reject) is rendered, based on whether the 
individual meets minimal standards in each 
relevant domain. Note that he is not ac- 
cepted merely because his examination yields 
scores which are acceptable “on the average.” 
He is, instead, rejected if he fails any aspect 
of the physical (e.g., heart condition) even 
though he is well above average in all other 
regards. Coombs (1964) has characterized 
this as a conjunctive evaluative procedure 
and it is clearly operative in many everyday 
situations.

Another combinatorial approach which is 
applied on occasion is a disjunctive procedure
in which overall evaluation is determined by 
the one attribute in which the compound-to- 
be-judged rates most favorably. For example, 
in a highly specialized game such as pro- 
fessional football, it is likely that the evalua- 
tion of individual players is determined by 
their effectiveness in the one position for 
which they are best suited, rather than by 
averaging or summing their ability over sev- 
eral positions. Thus, a player may be highly 
evaluated solely on the basis of his excellent 
performance as a quarterback; moreover, this 
evaluation would not be adversely affected 
by the fact that this same player might be 
completely inadequate as a defensive tackle 
(for a fuller discussion of conjunctive and 
disjunctive models see Coombs, 1964; also 
Dawes, 1964). It seems clear that our main 
need at the present time is for more research 
concerning those variables (e.g., topic, situa- 
tion, subject characteristics) which deter- 
mine the combinatorial model that is applied 
in the evaluation of complex social stimuli;
available evidence suggests that there is no
single model which can be universally
applied.
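
The contrast between the conjunctive, disjunctive, and averaging rules described above can be sketched as three small functions. The attribute names, scores, and cutoffs below are hypothetical values invented for illustration; they are not data from the paper or from Coombs (1964).

```python
def conjunctive_accept(scores, minimums):
    """Accept only if every attribute meets its own minimum standard."""
    return all(scores[name] >= cutoff for name, cutoff in minimums.items())


def disjunctive_value(scores):
    """Overall evaluation is set by the single best attribute."""
    return max(scores.values())


def averaging_value(scores):
    """Overall evaluation is the unweighted mean of all attributes."""
    return sum(scores.values()) / len(scores)


# Hypothetical inductee: strong everywhere except the physical examination.
inductee = {"physical": 2.0, "intellectual": 9.0, "personality": 8.0}
minimums = {"physical": 4.0, "intellectual": 4.0, "personality": 4.0}

# Hypothetical player: excellent quarterback, inadequate defensive tackle.
player = {"quarterback": 9.5, "defensive_tackle": 1.0}

print(conjunctive_accept(inductee, minimums), averaging_value(inductee))
print(disjunctive_value(player), averaging_value(player))
```

In this sketch the inductee is rejected under the conjunctive rule even though his average clears every cutoff, and the player's disjunctive evaluation is untouched by his weak second position, whereas an averaging rule would pull it down.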

It is interesting to observe that in the
above examples of conjunctive and disjunctive
combinatorial rules, the input elements
are assumed to derive from rather different
domains. For example, in the case of the
draft board examination an attempt is made
to combine information concerning physical
status, intellectual capacity, personality characteristics,
etc. The inputs from any one of
these sources would be compatible with an
almost endless variety of inputs from the
remaining sources; for example, good health
might go with high or low intelligence, and
with virtually any level of personal adjustment.
In contrast, the inputs for the present
studies were all drawn from a single domain,
in the sense that certain input combinations
(e.g., endorsement of highly pro- and anti-fraternity
statements) would have been quite
unlikely to occur in the normal experience of
our subjects. While these observations are
far from conclusive, they lead to the conjecture
that conjunctive and disjunctive combinatorial
procedures may typically be applied
when the judge is forced to integrate information
from a variety of domains, while averaging
and additive approaches may be
more common when the materials-to-be-combined
are derived from a single domain.


EFFECT OF A FAVOR WHICH REDUCES FREEDOM 1


JACK W. BREHM AND ANN HIMELICK COLE 2


Duke University 


The purpose was to show that when a favor reduces a person's freedom, it
arouses “psychological reactance,” a motivational state aimed at restoration of
this freedom. Ss, run individually, learned they were to make 1st-impression
ratings of another S (confederate) and then were given a soft drink by this S,
prior to making the ratings. In a 2 × 2 design, the 1st-impression ratings were
given low or high importance, and ½ of each importance group received the
favor, ½, not. An opportunity was presented for S to do a favor for the confederate
after the ratings. In the low-importance condition, the favor increased
the likelihood that S would perform a favor in turn for the confederate, and in
high importance, the favor reduced the likelihood that S would perform the
return favor.


When a person receives a favor he fre- 
quently feels that he ought to perform a 
favor in return. However, a recent theory by 
Brehm 3 suggests that a favor may also create
an opposing tendency, that is, a favor may 
arouse an individual to avoid performing a 
return favor. 

Briefly, the theory states that for a given 
individual at a given time, there is a set of
behaviors in which he believes he is free to 
engage. Any reduction or threat of reduction 
in that set of free behaviors arouses a moti- 
vational state, “reactance,” which is directed 
toward reestablishment of the lost or threat- 
ened freedom. If a person thought he was free 
to engage in Behaviors X, Y, and Z, and 
Behavior X were then somehow made impos- 
sible, he would experience reactance and 
would be motivated to recover his freedom to 
engage in X. Conversely, if he were “forced” 
to engage in X, his consequent reactance 
would lead him to avoid X and attempt Y 
or Z. 


1 This study was conducted by the second author
as part of her undergraduate honors work. It was 
supported in part by Grant G23928 to the first 
author from the National Science Foundation. The 
authors would like to express their gratitude to 
Norman Culbertson, who served as experimental 
confederate, and to Edward E. Jones for his helpful 
advice on the analysis of results and comments on 
this manuscript. 

2 Now at Manhattanville Hamilton Grange Neighborhood
Conservation, New York, New York.

3 “Psychological Reactions to Choice Reduction,” 
application for research grant to the National Sci- 
ence Foundation, January 1962. 


The amount of reactance experienced from
any given threat to, or reduction of, freedom
is a direct function of how important it
is to the individual to have that freedom.
Thus, the more important is the freedom of
the behavior threatened or actually eliminated,
the more will the individual attempt to
reestablish it as free.

For example, suppose Mr. Smith approached
a vending machine which sold
candy bars of Brands X, Y, and Z. Suppose
further that Smith occasionally purchased
each of these three brands, but that on this
occasion he intended to take Brand X. Normally,
of course, one inserts the proper coin
in a slot and then pushes a lever to select
which kind of bar he wants. But suppose that
this time when Smith inserted the proper
coin the machine simply dispatched a bar of
Brand X without waiting for Smith to push
the appropriate lever. Because his implicit
choice could not possibly have determined
the machine’s selection, Smith’s freedom has
been reduced and he should experience reactance.
The amount of reactance would be
directly proportional to the importance of his
being free to choose Brands Y or Z, and since
he does occasionally select these other brands,
this freedom would have some importance to
him. Smith could reduce his reactance by
selecting Y or Z, although since he cannot
trade X back into the machine, he would
have to insert another coin to accomplish
this mode of reducing reactance. In addition,
since Smith is now motivated to avoid X (as
well as gain Y or Z), he should now think
Brand X somewhat less attractive than he
had initially thought, and think Brands Y
and Z somewhat more attractive.

In interpersonal relations, a favor tends to 
put pressure on the favored person to return 
the favor. This pressure to return the favor 
is a threat to the freedom of the favored 
person in his relations with the favorer. Thus, 
the favored person should experience reac- 
tance, and the amount of reactance he ex- 
periences is a direct function of how impor- 
tant it is to him to be free of such pressure. 
He can reduce this reactance by “being free,”
that is, by not acting as though he were under 
some pressure to perform a favor in return. 

Suppose, for example, that Mr. Smith has 
just hired a new receptionist and that all he 
expects of her is that she be at her desk to 
receive people cordially. Since this is her 
task, it is important for Smith to be able to 
evaluate her performance in this regard. That 
is, it is important that he be free of irrelevant 
pressures which might tend to influence his 
decision to continue employing her. Thus he 
would not want to learn that she was the sole 
supporter of her sick mother. Similarly, if she 
spontaneously brought him a piece of home- 
made cake, he would tend to feel irrelevant 
pressure to make a positive evaluation of her 
no matter how much he liked the cake and no 
matter how benign he thought her intentions. 
Since this pressure would threaten his free- 
dom to evaluate her solely on the basis of her 
relevant performance, he would experience 
reactance. To reduce the reactance, he would
try to eliminate any feeling of obligation 
toward her, and consequently, would appear 
to be unappreciative of her gift and would 
avoid performing any kind of return favor. 

The hypothesis to be tested, then, is that 
a favor arouses reactance in the favored per- 
son in direct proportion to how important it 
is to the favored person to be free of any 
pressures created by the favor. This reac- 
tance will tend to result in refusal to perform 
a favor, subsequently, for the favorer. 


METHOD 


In order to test the hypothesis, male introductory 
psychology students, run individually, were told 
they were to make first-impression ratings of another
subject. A confederate of the experimenter,
posing as the subject who was to be rated, gave the 
subject a soft drink (Coca-Cola or Pepsi Cola) 
prior to the rating procedure. Subjects were then 
asked to rate the confederate on several dimensions, 
and they were also given an explicit opportunity to 
perform a favor for the confederate. Instructions 
were designed to vary how important it was to the 
subject to be free in regard to the confederate, and 
control conditions in which no soft drink was 
given served to assess “normal” ratings of the con- 
federate and “normal” tendencies to perform the 
favor for him. 


Procedure 


Male introductory psychology students volun- 
teered to participate in a study of projective testing 
techniques, and they were given the impression by 
the sign-up sheet that subjects were being scheduled 
two-at-a-time for each experimental session. If a 
subject arrived prior to 5 minutes before the sched- 
uled time, he found the door to the experimental 
room closed and a note which instructed him to sit 
and wait in one of the chairs at the door. At about 
5 minutes before the scheduled starting time, an- 
other male student, actually a confederate of the 
experimenter, arrived, read the note, and sat down 
in one of the chairs. If the subject arrived after the 
confederate, he found the confederate sitting there. 
As soon as both confederate and subject were seated 
outside the experimental room, the experimenter 
came out and explained that there would be a 5-10 
minute wait before the experiment began. At this 
point the confederate asked if he could leave for 
a few minutes and the experimenter gave her con- 
sent. She then sat down beside the subject and 
in a conversational tone explained to him that 
another person had asked her to collect some infor- 
mation on first impressions, though these first im- 
pression ratings had nothing to do with her own 
study on projective techniques. She told the subject
he was to form his first impression solely on
the basis of the other subject’s (confederate’s) responses
to three standard questions. The experimenter
added that the subject should therefore
eliminate from his mind any incidental interaction 
that might occur between the other subject and 
himself. 

Importance manipulation. The importance manipu- 
lation was then introduced. Half the subjects were 
run under low-importance instructions and half 
under high-importance instructions. The intent of 
this manipulation was to vary the extent to which 
subjects felt it was important to follow the ex- 
perimenter’s instructions of making ratings solely 
on the basis of the confederate’s answers to the 
standard questions. To the extent these instructions 
seemed important to the subject, he should also 
have felt it was important for him to be free of any 
irrelevant factors which might also influence his 
perception of the confederate. Specifically, then, the 
importance of the freedom threatened by the favor 
should covary with the importance of making accurate
ratings.

To establish low importance, the experimenter 
stated that the person for whom she was collecting 
the first-impression ratings was an undergraduate 
student in sociology. She also told the subject that
he need not be too concerned with being careful or 
accurate on the rating scale since it was merely a 
class project on which the sociology student was 
practicing.

To establish high importance, the experimenter 
stated that the person for whom she was collect- 
ing the first-impression ratings was Professor Ter- 
rell and that he had just received a large grant 
from the National Science Foundation to support 
his research. She explained that the object of the 
first-impression rating scale was to test how well 
the subject would judge another person on first 
impression and that Dr. Terrell’s research had 
shown that the people who made the most accurate 
first-impression judgments of others would succeed 
(in life). She then stated that Dr. Terrell’s test had 
been standardized on the basis of responses to the 
three standard questions (which the subject would 
hear before filling out the rating scale) and that 
the subject’s test results would be completely mean- 
ingless unless he formed his first impression of the 
other subject solely on the basis of his answers to 
the three questions. The experimenter said that the 
test was very important to Dr. Terrell and that she
hoped the subject would be as careful and accurate 
as he could be. Finally, she took the subject’s name 
and address so that his test results could be mailed 
to him. To support further the high importance 
manipulation, a title page appeared over the three- 
page first impression rating scale which subjects later 
received inside the experimental room. This page 
titled the rating scale “The Terrell Success-Failure 
Test” and also stated that the test was copyrighted 
by the National Science Foundation. 

Favor manipulation. The experimenter returned to
the experimental room and left the subject sitting 
alone. In a few minutes, the confederate returned 
and introduced the favor manipulation. Half the 
subjects received a favor and half received no favor. 

In the favor situation, the confederate returned 
with one soft drink which he immediately gave to 
the subject. The confederate refused any money 
offered for the soft drink. In the no-favor situa- 
tion the confederate simply came back and seated 
himself by the subject. 

The experimenter immediately opened the door, 
invited both “subjects” into the experimental room, 
and indicated that they should seat themselves at 
opposite ends of a small table. A removable shield 
separated the two so that although they could see 
each other, neither could observe what the other 
was writing. The experimenter explained briefly in 
all conditions that they were to make first-impres- 
sion ratings of each other on the basis of responses 
to three questions which each in turn would have 
an opportunity to answer aloud, and she noted that 
the ratings were of high (or low) importance. The 
experimenter alternately designated the confederate
or the subject to answer the first question first,
vice versa for the second question, and the same 
order as the first question for the third. The three 
questions were: “If you could travel any place in 
the world all expenses paid, where would you choose 
and why?” “What kind of woman do you want to 
marry?” “What occupational field do you intend to 
enter when you finish your schooling and what are 
your reasons for this choice?” 

The confederate’s answers to the three questions 
were always exactly the same and were designed to 
present a relatively noncommittal, though plausible
picture. In answer to the three questions the con- 
federate stated that he would like to travel to 
Western Europe; that he would like to marry a 
physically attractive, fairly intelligent woman who 
was also a good homemaker and party girl; and 
that he intended to enter the field of electrical engi- 
neering for financial reasons, to satisfy his own 
personal ideals, and to help other people. 

After the questions had been answered the ex- 
perimenter gave each a first-impression rating scale 
(to be described below), assured them of anonym- 
ity, and reminded them to make their ratings of 
each other on the basis of the three questions. 

Measure of tendency to return favor. When the 
questionnaires were completed, the experimenter 
removed the shield from between the two “subjects” 
so that they were now seated facing each other at 
opposite ends of a small table. She then picked up 
a stack of 8.5 × 11 typing paper, placed it in front
of the confederate and looking at him said, “Will 
one of you stack these papers into 10 piles of five 
for me please?” She then walked away, seated her- 
self at some distance, and leafed through the pro- 
jective test materials, although actually she could, 
unnoticed, watch the subject. As soon as the papers 
were placed in front of the confederate, he began 
stacking them. The experimenter recorded the 
number of piles of paper stacked by the confederate 
before the subject started to help, if he helped at 
all. 

When the papers were stacked, the experimenter 
asked the confederate to take them to a secretary in
a nearby office. After the confederate left, the 
subject was asked a series of increasingly directive 
questions in an attempt to ascertain whether he was 
in any way suspicious of the experimental pro- 
cedure. Finally, the experiment and its purpose were 
fully explained and the subject was asked not to 
tell other students. 

Experimental rating scale. The rating scale pro- 
vided a means of measuring the effects of reactance 
on perceptions of the confederate, and also pro- 
vided postexperimental checks on each of the ex- 
perimental manipulations. It consisted of three 
parts. Part 1 asked subjects to place a check in 1 
of 10 unlabeled boxes between pairs of antonyms to
indicate their perceptions of the confederate. There 
were 11 items: friendly-unfriendly, mature-immature,
intelligent-unintelligent, considerate-inconsiderate,
deep-shallow, straightforward-devious, interesting-uninteresting,
kind-unkind, genuine-affected,
unannoying-annoying, socially competent-socially incompetent.
Part 2 was identical with Part 1, except
that the subject was asked to rate himself rather
than the confederate.4 Part 3 consisted of nine
questions in answer to which the subject was to 
place a check mark anywhere on a 3-inch blank 
rating scale marked “not at all” on one end and 
“very much” on the other end. Four questions dealt 
with the subject’s tendency to approach or to avoid 
the confederate in the future: “How much do you 
like the other person?” “To what extent would you 
like to participate in further research with the other 
person?” “To what extent would you like to be 
with the other person socially?” “To what extent 
would you like to have the other person for a 
close friend?” Questions included as postexperimental 
checks on the favor manipulation and on the im- 
portance manipulation will be described under 
Results. 


Subjects 


A total of 77 male introductory psychology stu- 
dents were used as subjects, but 17 of these were 
eliminated by the application of criteria determined 
before the experiment was run. Six could not pos- 
sibly be used because they were good friends of the 
confederate, brought a soft drink for themselves, 
etc. Seven of the remaining 11 were suspicious, 3 
refused to accept the drink, and 1 did not want his 
test results on the first-impression ratings. Inclusion 
of the data for these 11 subjects in the total results 
would have no effect on the essential outcome and 
conclusions which may be drawn. 


Experimental Confederate 


The confederate was a junior undergraduate en- 
gineering student, paid for his help in the experi- 
ment. He seemed to be an average university stu- 
dent, not outstanding in any particular way which 
would affect the results. He was trained to standardize
his behavior during 28 pretest sessions.


Selection of the Favor and the Opportunity 
to Return the Favor 


In order that the favor create pressure on the 
subject to act favorably toward the confederate, it
had to be something “nicer” than generally occurs 
between total strangers, but not so nice as to 
arouse suspicion about the motives of the confed- 
erate. Thus it could not be the offer of a cigarette 
when the confederate was about to light one for him- 
self, because such an offer is more expected than 


4 The self-ratings were obtained as a possible 
control for individual differences in usage of the 
rating scales. They were not needed for this purpose 
since there was relatively little variability in the 
ratings for the confederate. Because there was no 
theoretical reason for obtaining the self-ratings, and 
since they do not seem to clarify the phenomena 
under study, they are not reported. 
not. Nor, on the other hand, could it be the presen- 
tation of a gold watch since that would create the 
impression the confederate was either crazy or 
perhaps up to something illegal or immoral. Pre- 
testing convinced us that the giving of a soft drink 
would probably fit our needs. 

In order to avoid the perception that the favor 
was a direct appeal to receive positive first-im- 
pression ratings, the confederate was allowed to walk 
away before the impression-formation task was 
mentioned by the experimenter. 

Originally, the confederate returned with a drink 
for himself as well as one for the subject. But some 
of the pretest subjects felt the gift was not suf- 
ficiently unusual in this form, and that it would 
create more pressure if the confederate did not have 
one for himself. 

The opportunity to return the favor was also 
selected through pretesting. Initially, the experi- 
menter asked the confederate to move a heavy box 
but this opportunity had so much demand character 
that all subjects offered to help. Paper stacking 
was then tried and found to be appropriate in that 
about half of the pretest subjects offered to help. 


Summary of Design 


A 2 × 2 design was used in which importance of 
being free in regard to the confederate constituted 
one independent variable, and favor or no favor 
constituted the other. Perceptions and motivations 
in regard to the confederate were measured on rat- 
ing scales, and an opportunity to return the favor 
constituted a behavioral measure. Sixty subjects in 
all, 15 in each of the four experimental conditions, 
provided data for the experimental analysis. 


RESULTS 


Answers to the questions intended to check 
the success of the manipulations were scored 
on a 12-category scale with 1 = “not at all” 
and 12 = “very much.” The mean answer to 
the question, “How important do you per- 
sonally feel these ratings are?” was 8.50 in 
the high-importance condition and 5.87 in 
the low, the difference being significant beyond 
the .01 level. This difference was about equal 
in the favor and no-favor conditions. In an- 
swer to the question, “To what extent have 
you tried to make these ratings completely 
accurate?” the high-importance subjects 
scored 11.07, and the low, 10.43. Even 
though accuracy was stressed only in the high- 
importance condition, the difference between 
the conditions on this measure is not reliable. 
However, since subjects in the high-import- 
ance conditions report the ratings as per- 


5 All statistical tests are two-tailed. 

TABLE 1 
NUMBER OF SUBJECTS WHO HELPED THE 
CONFEDERATE STACK PAPERS 

               Low importance   High importance 
No favor 
  Helped              9                7 
  Did not             6                8 
Favor 
  Helped             14                2 
  Did not             1               13 


sonally more important than do those in the 
low, we may conclude the manipulation of 
importance was successful. 

On the question, “To what extent would 
you go out of your way to do something nice 
for the other person?” subjects in the favor 
conditions gave a mean answer of 9.20, those 
in the no-favor conditions, 7.93, the difference 
being reliable (p < .05). On the question, 
“To what extent would you feel obligated to 
do something nice for the other person?” 
subjects in the favor conditions gave a mean 
answer of 8.37, subjects in the no favor, 5.00, 
this difference also being reliable (p < .01). 
These data indicate that the favor had the 
intended effect of creating pressure on sub- 
jects to perform a favor for the confederate.* 


Performance of the Favor for the Confederate 


While a favor will ordinarily create pressure 
on the favored person to perform a favor in 
return, as seen in the above questionnaire 
check on the favor manipulation, it was ex- 
pected that the favor would also arouse re- 
actance which would create pressure to avoid 
performance of a return favor. Since the 
magnitude of reactance is expected to be 
greater in the high-importance condition than 
in the low, the tendency of subjects to avoid 
performance of the favor for the confederate 
should be greater in the high-importance 
condition than in the low. 


6 We have assumed that these questions tap the 
direct pressure of the favor more than any counter- 
pressure from reactance. However, it would not have 
been surprising if subjects in the high-importance- 
favor condition had reported less desire to do some- 
thing nice, or less feeling of obligation, than sub- 
jects in the low-importance-favor condition. The 
fact that this did not occur is somewhat puzzling 
and will be taken up in the Discussion. 
The no-favor conditions show the strength 
of subjects’ tendency to help the confederate 
stack papers in the absence of his having done 
them a favor, and they also show whether or 
not there is any effect of the importance 
manipulation on this tendency. 

Table 1 shows the number of subjects in 
each condition who did or did not help to 
stack papers. As intended, about half (16 of 
30) of the subjects in the no-favor conditions 
helped. This proportion was planned since 
it allows the detection of both increased and 
decreased tendencies to perform the favor in 
the favor conditions. It should also be noted 
that the slight difference between high- and 
low-importance conditions is well within 
chance variation and is consistent with the 
assumption that the importance manipulation 
per se had no effect on the tendency to help 
stack papers. 

The behavior of subjects in the favor 
conditions, however, is strikingly different 
from the behavior of those in the no-favor 
conditions. In the low-importance condition, 
where reactance is presumably low, 14 of the 
15 subjects helped stack papers. This indicates 
that the favor created a strong pressure on 
subjects to perform a return favor, as ex- 
pected. But in the high-importance condition, 
only 2 of the 15 subjects helped stack papers. 
Clearly in the high-importance condition, 
there is not only some resistance to helping 
the confederate, there is a strong motivation 
to avoid helping the confederate. Thus, these 
data are consistent with the hypothesis that 
reactance is aroused by a favor when it is 
important to be free of pressures created by 
favors, and this reactance leads to avoidance 
of doing a return favor. 
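
The frequencies in Table 1 can be checked with a short calculation. The sketch below is not the analysis reported in the article, which does not name a test for these counts; it assumes a two-tailed Fisher exact test as one reasonable choice, and all function and variable names are illustrative.

```python
from math import comb

# Counts from Table 1: (helped, did not help) in each condition.
TABLE_1 = {
    ("no favor", "low"): (9, 6),
    ("no favor", "high"): (7, 8),
    ("favor", "low"): (14, 1),
    ("favor", "high"): (2, 13),
}


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact p for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


def importance_effect(favor):
    """p value for low versus high importance within one favor condition."""
    low = TABLE_1[(favor, "low")]
    high = TABLE_1[(favor, "high")]
    return fisher_exact_two_tailed(low[0], low[1], high[0], high[1])


if __name__ == "__main__":
    for favor in ("no favor", "favor"):
        print(favor, importance_effect(favor))
```

Under these assumptions the difference between low and high importance is far beyond conventional significance levels in the favor conditions and well within chance variation in the no-favor conditions, which agrees with the description above.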


Ratings of the Confederate 


While the ratings have no instrumental 
value for the reduction of reactance, it was 
expected that they would reflect negative 
feelings toward the confederate and a tendency 
to “bend over backwards” not to make posi- 
tive ratings of him in the high-importance- 
favor condition. However, this effect ap- 
peared only for the adjective rating “friendly- 
unfriendly,” on which the confederate was 
rated more positively in the favor than in the 
no-favor condition under low importance. 
This increase in perceived friendliness of the 
confederate in the favor condition does not 
occur where importance was high, the inter- 
action among the four conditions being sig- 
nificant by analysis of variance at the .05 
level. Thus, the favor fails to make the con- 
federate appear more friendly when it also 
arouses reactance. 

Differences on other items, however, are 
confined to the effect of the importance ma- 
nipulation, the effect being that the con- 
federate is rated somewhat less positively in 
the high- than in the low-importance con- 
dition on “interesting-uninteresting,” “socially 
competent-incompetent,” and “straight for- 
ward-devious,” particularly in the no-favor 
condition. No interaction is significant and it 
therefore appears that the high-importance 
instructions simply lead subjects to rate the 
confederate somewhat more conservatively. A 
similar but weaker trend appears in the self- 
rating data. No differences appeared on the 
approach-avoidance items. 


DISCUSSION 


It was hypothesized that a favor which 
reduces a person's freedom arouses reactance 
and a consequent desire to be free in regard to 
the favorer. Thus, when a person experiences 
relatively great reactance from receiving a 
favor, he will tend to avoid performing a 
return favor even when there is a clear op- 
portunity to do so. This reasoning is sup- 
ported rather well by the results of the present 
experiment. Where the importance of being 
free in regard to the confederate was rela- 
tively low, the favor increased the tendency 
of subjects to perform the return favor. But 
where the importance of being free was rela- 
tively great, the favor decreased subjects’ 
tendency to perform the return favor. 

It might be thought that the confederate 
inadvertently behaved differently in the dif- 
ferent experimental conditions, and that such 
differences in behavior account for the ob- 
tained effects. However, in addition to the 
fact that the confederate was well trained 
prior to the beginning of the experiment, 
differences in the confederate’s behavior strong 
enough to affect the performance of the return 
favor should also have affected the ratings. 
But as we have seen, there were no differences 
as a function of the favor manipulation ex- 
cept on “friendliness.” In particular, there 
were no differences on the relevant items, 
“straightforward-devious,” “genuine-affected,” 
and “socially competent-socially incompetent.” 
Thus, it does not seem plausible to account 
for the results by supposing that the con- 
federate behaved differently prior to the 
ratings. 

Neither is it likely that subjects were trying 
to show the experimenter they were unaffected 
by having received the soft drink, since after 
the experimenter placed the paper in front 
of the confederate, she clearly removed her- 
self from the experimental table and proceeded 
to leaf through the projective testing ma- 
terials, apparently not paying attention to 
the stacking. In addition, subjects in the 
high-importance-favor condition showed no 
greater desire to make accurate ratings 
(11.00) than those in the high-importance- 
no-favor condition (11.13), although they 
presumably would have if they had been 
trying to show the experimenter they were 
unaffected by the favor. 

A problem for which we have no completely 
satisfactory solution is the discrepancy be- 
tween overt and verbal behavior: in the 
high-importance-favor condition, subjects re- 
ported a relatively strong desire and obliga- 
tion to do something for the confederate al- 
though when actually given the opportunity 
to do something, only 2 out of 15 did. The 
overt behavior, of course, is consistent with 
expectation. The question centers, then, on 
why subjects said they wanted to do some- 
thing for the confederate. 

Our best guess is that subjects were trying 
to be as objective as possible in rating the 
confederate. In other words, the subjects 
distinguished between the fact that they had 
received a favor from the confederate and 
consequently owed him something in return, 
and their experiencing reactance and the 
consequent disinclination actually to do any- 
thing for him. Subjects attempted to make 
the ratings objective by basing them on the 
"objective" situation. On the other hand, the 
stacking of papers apparently had nothing to 
do with the rating task (or any formal aspect 
of research) and therefore more clearly re- 
vealed the subjects’ true motivational state 
in regard to the confederate. 

If we are correct in our interpretation that 
the favor aroused a motivational state counter 
to the normal pressure it creates because it 
reduced the individual’s freedom, then a way 
is opened for the understanding of a variety 
of interpersonal phenomena. For while a 
favor is a specific type of inducing force, the 
theoretical explanation is more general. The 
critical question is not what type of stimulus 
is used to place pressure on a person to per- 
form a given behavior, but rather to what 
extent the resultant pressure tends to reduce 
or threaten to reduce the individual’s freedom. 
When specifiable freedoms are reduced or 
reduction is threatened, it can be predicted 
that the individual will experience reactance 
and will attempt to regain the lost or 
threatened freedom. 


(Received December 29, 1964) 
LEARNING VERBAL REFERENTS OF PHONETIC SYMBOLS 
3 learning experiments were conducted in order to determine whether phonetic 
symbolism has a coercive influence upon learning. 3 hypotheses were tested. 
It was found that: (a) pairs of phonetic symbols and congruent referents 
were learned more easily than phonetic symbols paired with noncongruent or 
unrelated referents, (b) the errors made in learning noncongruent pairs were 
predominantly the congruent referent of those symbols rather than another 
noncongruent referent, and (c) pairs which were congruent or noncongruent 
were easier to learn than pairs which were unrelated in terms of phonetic 
symbolism. 


Two major theories, both at least as old 
as Plato’s Kratylus, may be distinguished 
with respect to the origin of language or 
how words acquire meaning. One view main- 
tains that the initial process involves the 
formation of gesture-like words which natu- 
rally imitate or re-present their referents 
(e.g., Johannesson, 1944; Paget, 1930). The 
other view argues that words acquire signifi- 
cance by convention just like labels which 
are arbitrarily assigned to things (e.g., 
Brown, 1958; Thorndike, 1946). 

Proponents of the first view, namely, natu- 
ral reference, have long pointed to evidence 
of consensual comprehension of phonetic 
symbols as support for their position. Positive 
findings from three types of matching experi- 
ments are cited. In the first type of experi- 
ment, subjects are presented with differential 
sets of nonverbal patterns which they are to 
match, for example, TAKETE and ULOOMU 
with two line drawings [figures not reproduced] 
(e.g., Davis, 1961; Köhler, 1947). The second 
general paradigm 
derives from Sapir’s (1929) technique. It 
involves the use of nonverbal patterns as new 
names for verbal referents, for example, sub- 
jects match a given phonetic pattern, such 
as ZAH, with one of a dimension of reference, 
such as red, blue, green, and yellow (e.g., 


1 The authors wish to express their appreciation 
to G. Keppel for his critical reading of the manu- 
script and suggestions. They also wish to thank 
M. Hollingsworth for her assistance. 


Langer & Rosenberg, 1964; Rosenberg, 
Badia, & Langer, 1964). Finally in a third 
variety of these experiments subjects match 
sets of words from different languages, one or 
both of which are not familiar to the subject 
(e.g., Miron, 1961; Tsuru & Fries, 1933). 

Proponents of conventional reference ques- 
tion the validity of these findings because the 
method used is always a matching task. As 
early as 1933, Bentley and Varon presented 
negative evidence to support the contention 
that both the ability of subjects to match 
phonetic patterns with referents and the high 
intersubject consistency are artifacts of the 
method, that is, the result of experimenters 
indicating to their subjects the potential util- 
ity of the phonetic patterns as symbols for 
the reference choices presented. 

A particularly strong version of this argu- 
ment has been made more recently by Brown 
(1958). It is also based upon negative re- 
sults, that is, the inability to produce behav- 
ioral consequences of phonetic symbolism, in 
learning experiments, when their potential 
representational function is not directly re- 
quested by the experimenter. Since theories 
of natural reference should expect positive 
behavioral consequences, Brown concluded 
that neither phonetic symbolism nor any 
other form of natural reference can play a 
functional role in the formation of ordinary 
language, with the possible exception of 
poetry. Consequently, he questioned repre- 
sentational or imitative theories of the origin 
of language. Brown asserted, on this basis, 
that the function of language is “conventional 
reference” and that the sign-significance rela- 
tionships between words and things are the 
result of arbitrary associations subject to the 
laws of learning, that is, contiguity and 
reinforcement. 

The inability to produce by experimental 
methods positive behavioral consequences of 
phonetic symbolism seriously questions the 
assertion that at first symbolic forms, such as 
phonetic symbols, naturally represent or imi- 
tate referents. The primary purpose of the 
present study was therefore to determine 
whether behavioral consequences of phonetic 
symbols could be obtained in learning experi- 
ments. To this end phonetic symbols ob- 
tained in a previous study (Langer & Rosen- 
berg, 1964) were utilized. These symbols do 
not seem to violate any requirements which 
Brown might maintain since they were com- 
plete syllables spontaneously produced by the 
subjects requested to construct new names to 
represent color and spatial concepts, and sig- 
nificant consistencies obtained among an 
independent group of subjects when required 
to choose a color (red, blue, green, or yel- 
low), an evaluation (virtuous, good, neutral, 
bad, or evil), etc., which was most fittingly 
represented by each syllable. Three types of 
pairs were constructed in order to perform 
the experiments: congruent, that is, pairs 
which the previous group of subjects had 
consistently matched to each other, such as 
ZAH-RED; noncongruent pairs, that is, the 
same symbol paired with a referent to which 
it had not been consistently matched but of 
the same referent dimension, such as ZAH- 
BLUE; and in Experiment I unrelated, that 
is, pairs composed of a symbol, for which 
equiproportional distribution of choices ob- 
tained on the color dimension, and a ran- 
domly assigned referent, such as KOF-RED. 

Three hypotheses were tested. As stated 
above, the theory of natural reference as- 
serts that phonetic symbols represent their 
referents. Hypothesis 1: Congruent pairs are 
expected to be learned more rapidly than non- 
congruent pairs. 

According to the recent organismic-de- 
velopmental formulation of the theory of 
natural reference by Werner and Kaplan 
(1963), the symbolic relationship between 
phonetic patterns and referents is the result 
of a continuous process of matching them to 
each other until they naturally fit for the 
subject. Since noncongruent pairs, as com- 
pared with congruent, should not naturally 
fit for the subject it is expected that they 
will tend to reconstruct the pairs so that the 
referent responses will fit the symbols. 
Hypothesis 2: More “errors” are expected 
for the noncongruent pairs than congruent 
pairs. The “errors” for the noncongruent 
pairs are expected to be congruent referents, 
that is, the predominant choices of the 
normative group, rather than other non- 
congruent referents. For example, the non- 
congruent pair ZAH-BLUE should yield more 
“errors” than ZAH-RED. The “errors” for 
ZAH-BLUE are expected to be predominantly 
red which was the consistent choice of the 
normative group, rather than green which 
was not a favored choice. 

Finally, the theory of natural reference 
maintains that phonetic symbols and refer- 
ents, whether similar or antagonistic to each 
other, are nevertheless meaningfully related. 
They should therefore be learned more easily 
than unrelated pairs, in a fashion similar to 
ordinary words which are learned more rap- 
idly when paired with their synonyms or 
antonyms than when they are paired with 
unrelated words (McGeoch & Irion, 1932; 
Osgood, 1953). Hypothesis 3: Congruent and 
noncongruent pairs will be learned more easily 
than unrelated pairs. 

In order to maximize the generality of 
the findings to be obtained, several dimen- 
sions of referent responses (color, evalu- 
ative, simple space, and direction) and mixed 
and unmixed lists (Twedt & Underwood, 
1959) were utilized. Mixed lists permit the 
subjects to focus upon the easier pairs first 
and then to learn the more difficult pairs. 
That is, it is possible that the effect may 
not be symmetrical but primarily the result 
of the congruent pairs. Consequently un- 
mixed lists, which preclude this possibility, 
were also used since this does not permit the 
subjects to selectively focus on congruent or 
noncongruent pairs for learning. 

Finally, it was possible that pairs whose 
stimulus and response members have common 
letters would be learned more easily. This 
would confound any interpretation of the 
present findings in terms of phonetic symbol- 
ism. Accordingly, a control experiment was 
conducted in which more noncongruent than 
congruent pairs had letter commonality 
(since it was not possible to construct lists 
that included exactly equal number of pairs 
with letter commonality). This seemed to be 
an appropriate experimental strategy since 
the effect would be to work against the 
hypothesis that congruent pairs will be 
learned more easily than noncongruent pairs. 
Any positive findings would therefore consti- 
tute strong confirmation of the hypothesis. 


EXPERIMENT I: MIXED LISTS 
Method 


Lists. Two paired-associate lists were constructed. 
Each contained congruent, noncongruent, and un- 
related pairs. Selection of pairs was based upon a 
prior assessment procedure by an independent group 
of undergraduate subjects at the University of 
California (Langer & Rosenberg, 1964). Although an 
audio-visual method was used in the normative 
study, a visual method of presentation was em- 
ployed in the present experiments since it has been 
established (Brackbill & Little, 1957; Brown, Black, 
& Horowitz, 1955) that the method of presentation 
is not a significant variable. 

Congruent pairs were defined as those for which 
high consistency in matching phonetic symbols to 
colors had been obtained. The three highest for the 
colors red, blue, and green were utilized. Non- 
congruent pairs were formed by randomly assigning 
each sound to one of the two other colors, but in 
such a way that three red, three blue, and three green 
pairs were used. Consequently each stimulus appeared 
on both lists, but on one it was paired with its 
congruent color while on the other with a non- 
congruent color. Unrelated pairs were defined as 
those for which equiproportional distribution of 
color choices had been obtained. Each sound was 
randomly assigned a color: one red, one blue, and 
one green unrelated pair was used. These three 
unrelated pairs appeared on both lists. 

Thus each list consisted of 12 pairs. The stimuli 
were the same on both lists. In addition, both lists 
had 4 red, 4 blue, and 4 green responses. In order 
to maintain this balance of stimuli and responses 
across lists, it was necessary to randomly assign 5 
congruent and 4 noncongruent responses to one list 
and the other 4 congruent and 5 noncongruent 
responses to the other list. The two lists were as 
follows: List 1: 5 congruent: ZAH-RED, SKAF-RED, 
MU-BLUE, OOM-BLUE, ISH-GREEN; 4 noncongruent: 
NERD-RED, TUR-BLUE, SOOL-GREEN, KLAK-GREEN; and 
3 unrelated: KOF-RED, EP-BLUE, ETH-GREEN. List 2: 
4 congruent: KLAK-RED, SOOL-BLUE, NERD-GREEN, 
TUR-GREEN; 5 noncongruent: ISH-RED, MU-RED, ZAH- 
BLUE, SKAF-BLUE, OOM-GREEN; and the 3 unrelated 
pairs. 
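
The balancing constraints described for the two mixed lists can be verified mechanically. The following is a minimal sketch, not part of the original method: the pairs are transcribed from the list description above (the spellings KOF-RED and KLAK-GREEN are the readings assumed here), and the helper names are illustrative.

```python
from collections import Counter

# (syllable, color) pairs transcribed from the description of the lists.
LIST_1 = {
    "congruent": [("ZAH", "RED"), ("SKAF", "RED"), ("MU", "BLUE"),
                  ("OOM", "BLUE"), ("ISH", "GREEN")],
    "noncongruent": [("NERD", "RED"), ("TUR", "BLUE"), ("SOOL", "GREEN"),
                     ("KLAK", "GREEN")],
    "unrelated": [("KOF", "RED"), ("EP", "BLUE"), ("ETH", "GREEN")],
}
LIST_2 = {
    "congruent": [("KLAK", "RED"), ("SOOL", "BLUE"), ("NERD", "GREEN"),
                  ("TUR", "GREEN")],
    "noncongruent": [("ISH", "RED"), ("MU", "RED"), ("ZAH", "BLUE"),
                     ("SKAF", "BLUE"), ("OOM", "GREEN")],
    "unrelated": LIST_1["unrelated"],  # same three pairs on both lists
}


def all_pairs(word_list):
    """Flatten one list into a single sequence of (syllable, color) pairs."""
    return [pair for pairs in word_list.values() for pair in pairs]


def response_counts(word_list):
    """How often each color occurs as a response on the list."""
    return Counter(color for _, color in all_pairs(word_list))


def stimuli(word_list):
    """The set of syllables used as stimuli on the list."""
    return {syllable for syllable, _ in all_pairs(word_list)}


def crossed_correctly(list_a, list_b):
    """True if each congruent syllable on list_a reappears on list_b
    as a noncongruent pair with a different color."""
    other = dict(list_b["noncongruent"])
    return all(s in other and other[s] != color
               for s, color in list_a["congruent"])


if __name__ == "__main__":
    for name, word_list in (("List 1", LIST_1), ("List 2", LIST_2)):
        print(name, len(all_pairs(word_list)), dict(response_counts(word_list)))
```

A check of this kind confirms that both lists contain 12 pairs, share the same stimuli, and use each color four times as a response.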

Four random orders of each list were used in 
order to minimize serial learning. One restriction was 
placed upon the randomization procedure; a par- 
ticular pair was not permitted to follow itself for 
at least six pairs. 
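
The restriction on the random orders can be illustrated with a short rejection-sampling sketch. This is an interpretation rather than the authors' documented procedure: it assumes the rule means that, when one order is followed by the next, at least six other pairs must intervene between two appearances of the same pair. The names `min_gap_ok` and `make_orders` are hypothetical, and the wrap-around from the last order back to the first is not checked.

```python
import random


def min_gap_ok(prev_order, next_order, min_gap=6):
    """True if every item has at least `min_gap` other items between its
    appearance in prev_order and its appearance in next_order."""
    n = len(prev_order)
    pos_prev = {item: i for i, item in enumerate(prev_order)}
    for j, item in enumerate(next_order):
        # Items after it in the previous order plus items before it in the next.
        between = (n - 1 - pos_prev[item]) + j
        if between < min_gap:
            return False
    return True


def make_orders(pairs, n_orders=4, min_gap=6, seed=0):
    """Draw random orders, rejecting any order that violates the gap rule
    relative to the order immediately before it."""
    rng = random.Random(seed)
    orders = [rng.sample(pairs, len(pairs))]
    while len(orders) < n_orders:
        candidate = rng.sample(pairs, len(pairs))
        if min_gap_ok(orders[-1], candidate, min_gap):
            orders.append(candidate)
    return orders


if __name__ == "__main__":
    pairs = ["ZAH", "SKAF", "MU", "OOM", "ISH", "NERD",
             "TUR", "SOOL", "KLAK", "KOF", "EP", "ETH"]
    for order in make_orders(pairs):
        print(" ".join(order))
```

With 12 pairs and a minimum gap of six, a usable order is found after relatively few random draws, so simple rejection is adequate here.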

Procedure. The lists were presented on a memory 
drum set at a 2:2-second rate with an interpair 
interval of 1.5 seconds. On first presentation subjects 
read each pair aloud. They were told that a syllable 
would appear in the left-hand window of the 
memory drum and then the name of a color would 
appear in the right-hand window. The instructions 
were to read each one aloud as it appeared. On the 
subsequent 11 test trials, subjects anticipated the 
color. Subjects were instructed not to read the 
syllables aloud but to name the colors which go 
with them before they appear on the right. Subjects 
were urged to give a response each time even if 
they had to guess. 

Subjects. Forty undergraduate students at the 
University of California served as subjects. Twenty 
were tested on List 1 and 20 on List 2. Seven of 
the subjects were able to speak another language 
besides English (2 Spanish, 1 German, 1 German 
and French, 1 Italian, 1 Japanese, and 1 Swedish). 


Results 


Congruent pairs were more easily learned 
than noncongruent or unrelated pairs. Table 
1 presents the overall mean number correct 
per pair. The mean number of total correct 
responses was 8.65 for the congruent, 7.79 
for the noncongruent, and 7.45 for the 
unrelated pairs. The difference between 
treatments (congruent versus noncongruent 
versus unrelated) was a significant source 
of variance (F = 3.91, df = 2/36, p < .05). 
Comparison by multiple t tests yielded sig- 
nificant differences between congruent and 
noncongruent pairs at less than the .05 level 
(t = 2.53) and between the congruent and 
unrelated pairs at less than the .01 level 
(t = 2.93). Although in the predicted direc- 
tion, the mean difference between non- 
congruent and unrelated pairs was not sig- 
nificant. Neither Lists (F = .31, df = 1/38) 
nor Lists × Treatments (F = 1.06, df = 
2/16) were significant sources of variance. 

Comparison of the number of correct re- 
sponses when a given stimulus was paired 
with a congruent versus a noncongruent 
response indicates that seven of the nine were 
in the predicted direction. Further analysis 
by multiple t tests showed that the differ- 
ences for four of these (MU, OOM, SOOL, and 
ISH) were significant at less than the .05 
level. Two reversals of direction obtained for 
ZAH and SKAF when paired with the non- 
congruent response BLUE, but the differences 
do not approach significance. 

TABLE 1 
CORRECT RESPONSES ON MIXED LISTS WHEN SYMBOLS 
WERE PAIRED WITH CONGRUENT, NONCONGRUENT, 
AND UNRELATED RESPONSES 
[Column headings: Congruent, Noncongruent, Unrelated; table body not recovered.] 

More overt errors were committed on the 
noncongruent than congruent pairs: mean of 
1.59 versus 1.18. Although in the expected 
direction, when corrected for error opportuni- 
ties, the difference is not statistically sig- 
nificant. 
Apparently all subjects were aware after 
the presentation trial that only three color 
responses were appropriate. Consequently, 
the only incorrect responses given for the 
noncongruent pairs were one of the two other 
colors. In terms of phonetic symbolism, the 
incorrect responses which were given were 
therefore either a noncongruent or the 
congruent color referent. Almost twice as 
many congruent as noncongruent incorrect 
responses were given to noncongruent pairs, 
a mean of 1.85 versus .97. Determination of 
the number of subjects who made more con- 
gruent than noncongruent errors yielded a 
distribution which was significantly different 
from .5 at less than the .001 level, using the 
binomial test, two-tailed. 
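
The two-tailed binomial test mentioned here can be sketched as follows. The article does not give the number of subjects falling on each side, so the counts in the example (30 of 36) are purely hypothetical and serve only to show the calculation; the function name is likewise illustrative. The sketch doubles the smaller tail, which is exact for a null proportion of .5.

```python
from math import comb


def binomial_two_tailed(successes, n, p=0.5):
    """Two-tailed binomial p value, doubling the smaller tail (capped at 1)."""

    def pmf(k):
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    upper = sum(pmf(k) for k in range(successes, n + 1))
    lower = sum(pmf(k) for k in range(0, successes + 1))
    return min(1.0, 2 * min(upper, lower))


if __name__ == "__main__":
    # HYPOTHETICAL counts, not reported in the article: 30 of 36 subjects
    # (ties excluded) making more congruent than noncongruent errors.
    print(binomial_two_tailed(30, 36))
```

A split as lopsided as the hypothetical one would fall below the .001 level, whereas an even split yields a p value of 1.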


EXPERIMENT II: UNMIXED LISTS 
Method 


Lists. A congruent and a noncongruent list were 
constructed. The basis for selecting congruent and 
noncongruent pairs was the same as in Experi- 
ment I, with a single exception: the words to which 
the sound patterns had been consistently matched 
were “GOOD” or “BAD.” 

Four GOOD and four BAD pairs were utilized. Thus, 
each list consisted of eight pairs. On the congruent 
list, the four stimuli for the GOOD response were 
EP, ZAH, ZING, ITE, and for the BAD response they 
were ISH, SKRINT, IK, NERD. On the noncongruent 
list the stimuli and responses were reversed. 

Six random orders were used. These were the 
same for both lists. Thus, the stimuli appeared in 
exactly the same place on both lists. Only the 
responses were different. 

Procedure. Same as Experiment I with the excep- 
tion that only five test trials were used. 

Subjects. Seventy undergraduates at the University 
of California served as subjects. Thirty-five were 
tested on each list. All the subjects were volunteers. 


Results 


Congruent responses were more easily 
learned than noncongruent responses to the 
same stimuli. The mean was 7.10 correct for 
the congruent and 6.10 for the noncongruent. 
Figure 1 shows the mean number correct per 
trial for the congruent and noncongruent 
pairs. It is apparent that the number correct 
was smaller on all five test trials for the non- 
congruent pairs than for the congruent pairs 
on the first test trial. 

FIG. 1. Learning curves for unmixed lists. 

Analyses of variance performed on these 
data indicate that treatments were significant 
sources of variance (F = 19.35, df = 1/68, 
p < .01). It can be further seen from Fig- 
ure 1 that the disparity in mean number 
responses correct for congruent and non- 
congruent pairs increased over trials, from a 
mean difference of .56 to 1.29. However, 
only trials was a significant source of 
variance (F = 4.57, df = 4/272, p < .01); 
the interaction between treatments and trials 
was not (F = 1.64, df = 4/272). 

Pairs (F = 3.80, df = 7/476, p < .01) and 
the interaction between treatments and pairs 
(F = 4.65, df = 7/476, p < .01) were sig- 
nificant sources of variance. Higher fre- 
quency of correct responses obtained when 
the response was congruent for seven of the 
eight pairs (see Table 2). The difference for 
six of the seven was significant, using multi- 
ple t tests. The only reversal obtained for 
ISH, which was remembered slightly better 
when paired with the noncongruent response, 
GOOD, but the difference does not approach 
significance. 

Subjects made a mean of 1.10 overt errors 
in response to each noncongruent pair and 
.57 to each congruent pair. When corrected 
for error opportunities, a test of the mean 
differences yields a t with the probability 
of < .06. 


EXPERIMENT III: CONTROL OF LETTER COMMONALITY 
Method 


Lists. Separate unmixed lists of nine congruent 
and nine noncongruent pairs were used. In order to 
obtain more noncongruent than congruent pairs 
with common letters, responses had to be drawn 
from three referent dimensions, that is, color, simple 
space, and direction. Five of the noncongruent and 
three of the congruent pairs used had one or more 
common letters in their stimulus and response 
members. Moreover, all the pairs which had com- 
mon letters when matched congruently also had 
common letters when matched noncongruently. 
TABLE 2 

CORRECT RESPONSES ON UNMIXED LISTS WHEN 
SYMBOLS WERE PAIRED WITH CONGRUENT 
AND NONCONGRUENT RESPONSES 

                      Responses 
Symbol          Congruent   Noncongruent   Mean difference 
EP        M       4.49         4.09           +0.40* 
          SD      1.0          1.1 
ZAH       M       4.31         4.26           +0.05 
          SD      1.0          1.24 
ZING      M       4.89         3.63           +1.36 
          SD      0.7          1.5 
ITE       M       4.09         3.09           +1.00 
          SD      1.3          2.7 
ISH       M       3.77         4.41           -0.34 
          SD      1.2          1.4 
SKRINT    M       4.40         3.41           +0.69 
          SD      1.0          1.4 
IK        M       4.69         3.43           +1.20 
          SD      0.7          2.7 
NERD      M       4.60         4.20           +0.40 
          SD      1.0          1.0 
Total     M       7.1          6.1            +1.00** 
          SD      1.4          1.3 

* p < .05. 
** p < .01. 

The congruent list was: WAP-RED, nerd-green,
MIP-YELLOW, BEP-SHORT, SOOL-WIDE, ZAH-UP, TUR-DOWN,
haf-right, eth-left. The noncongruent list
was: WAP-GREEN, nerd-yellow, MIP-RED, BEP-TALL,
sool-narrow, ZAH-DOWN, tur-up, haf-left, eth-right.
All pairs which have letter commonality are printed
in italics (shown here in lowercase).

Noncongruent pairs on the simple spatial and 
directional dimension were constructed by pairing 
the phonetic symbol with the opposite of its con- 
gruent referent, for example, the congruent pair 
BEP-SHORT was made noncongruent by making the
response be the opposite, namely, TALL. Since this 
was not possible in constructing noncongruent pairs 
for the color dimension, color responses were ran- 
domly assigned to phonetic symbols. 

Three random orders of the lists were used. Since 
these orders were the same on both lists, the stimuli 
appeared in exactly the same place on both lists. 
Only the responses were different. 

Procedure. Same as Experiment I, with the excep- 
tion that nine test trials were used. 

Subjects. Twenty undergraduates at the University 
of California served as subjects. Ten were tested on 
each list. All the subjects were volunteers.




Results 


Congruent pairs were learned more rapidly 
than noncongruent pairs even though more 
noncongruent pairs had common letters in 
their stimulus and response members. The 
mean number correct was 6.41 for the con- 
gruent and 5.05 for the noncongruent pairs. 
Figure 2 presents the mean number correct 
per trial for the congruent and noncongruent 
lists. The curves never cross, indicating that 
for each trial more congruent than noncon- 
gruent pairs were learned. 

Analyses of variance were performed upon
the data. Treatments were a significant source
of variance (F = 5.97, df = 1/18, p < .05).
Trials were also significant (F = 65.94,
df = 8/136, p < .01) but the interaction of
Treatments X Trials was not (F = .67,
df = 8/136).

Differences between pairs (F = 9.63, df =
8/136, p < .01) and Treatments X Pairs (F
= 1.85, df = 8/136, p < .10) were significant
sources of variance. Table 3 shows that
the mean number of correct responses was
always higher when the phonetic symbols were
paired with congruent than with noncongruent
responses. The difference was significant
for three of the nine pairs, using t tests. It
should be noted that no letter commonality
obtained on either the congruent or noncongruent
condition for the significant pairs.


TABLE 3

CORRECT RESPONSES WHEN LETTER COMMONALITY
IS CONTROLLED

                        Responses
Symbol            Congruent   Noncongruent   Mean difference
BEP       M         4.80         1.40           3.40
          SD        2.68         1.56
MIP       M         6.20         4.20           2.0*
          SD        1.89         2.09
ZAH       M         6.70         3.70           3.0
          SD        1.55         2.45
HAF       M         5.40         4.10           1.3
          SD        2.37         2.62
NERD      M         7.80         7.40           0.4
          SD        0.98         1.56
SOOL      M         6.40         5.90           0.5
          SD        1.62         1.58
TUR       M         6.60         6.20           0.4
          SD        1.74         2.40
WAP       M         6.60         6.00           0.6
          SD        2.37         2.53
ETH       M         7.20         6.50           0.7
          SD        1.54         2.20
Total     M         6.41         5.05           1.36*
          SD        2.10         2.80

* p < .10.
** p < .05.
*** p < .01.


Discussion


Congruent pairs were learned more rapidly
than noncongruent pairs, even when more
noncongruent pairs had members with common
letters. Symbolically related pairs,
whether congruent or noncongruent, were 
learned more quickly than pairs which were 
unrelated: the difference between congruent 
and unrelated pairs was statistically significant
while that between noncongruent and
unrelated was not. Subjects tended to make
more referent “errors” in remembering the
noncongruent responses. These “errors” were
predominantly congruent rather than noncongruent.
The results appear to be true for
a range of phonetic symbols and their verbal
referents along different dimensions of meaning,
both when mixed and unmixed lists are
used as methods of presentation.
These positive findings are consistent with
the organismic-developmental theory of natural
reference (Langer & Rosenberg, 1964;
Werner & Kaplan, 1963) as the first genetic
stage of symbol formation. The theory
asserts that symbols may refer to things with
which they are not related by arbitrary con- 
vention. Moreover, natural symbols may rep- 
resent things even when they are not direct 
or explicit copies of the referents. The process 
whereby individuals establish nonmimetic, 
representational relationships between associates
is that of metaphorically * relating the
implicit forms of the symbols and the referents.
Werner (1948) has characterized this
process as physiognomization to indicate that 
the “inner form of life” of the symbols and 
of the referents are molded to each other 
until the individual is satisfied that they fit 
metaphorically. Thus, although the explicit 
forms of the symbols and the referents may 
not have changed from the viewpoint of the 
external observer, they have been transformed 
in the perception of the subject into symbols 
whose implicit forms fit that of the referents 
which they represent. The inner forms of the 
symbols affect and are affected by those of 
the referents to which they are linked. 

This theoretical formulation provides the 
basis for explaining the positive results ob- 
tained in the present study. Congruent pairs 
were learned more easily than noncongruent 
pairs since they were composed of symbols 
which naturally fit their referents (as inde- 
pendently determined by the normative study 
of Langer & Rosenberg, 1964). Both con- 
gruent and noncongruent pairs were easier to 
learn than unrelated pairs because congruent 
and noncongruent symbols and referents 
could be naturally related to each other, 
whether in a synonymous or antagonistic 
fashion. The symbols and referents composing
the unrelated pairs, on the other hand, do
not naturally fit each other in any way and 
were therefore most difficult to remember. 
This provides further support for the asser- 
tion made by those who adhere to the view 
of the physiognomic origins of language 
(Köhler, 1937; Scheerer & Lyons, 1957;
Werner & Kaplan, 1963; etc.) that certain
vocal forms are intrinsically more appropriate
for exploitation as symbolic vehicles to convey
meaning. It also suggests the necessity
for further investigation to determine the
dynamic and structural characteristics of vocal
forms which do and do not lend themselves
to symbolic utilization and whether the
selection of characteristics and the meaning(s)
ascribed to them are culture bound or
free.


* We refer to the process as metaphorical because
a metaphor is defined in rhetoric as the “use of
a word or phrase literally denoting one kind of
object or idea in place of another by way of suggesting
a likeness or analogy between them [Webster’s
New International Dictionary, 1955].”

The above described theory of natural 
reference also accounts for the subjects’ strik- 
ing tendency to "err" congruently when pre- 
sented with noncongruent pairings. The pro- 
duction of congruent responses reflects the 
tendency to reconstruct the pairs which do 
not physiognomically fit each other in such a 
way that the symbols will come to naturally 
represent the referents. This conceptualization 
suggests the necessity for further learning 
studies in which the required oral responses 
would be the symbols and/or the referents. 
The concept of physiognomization implies 
that when presented with noncongruent pair- 
ings there should be a tendency to recon- 
struct both the symbols and the referents 
until they metaphorically fit. It may therefore
be expected that sensitive techniques of
phonological analysis might detect subtle
but congruent “errors” in pronunciation of
the symbols in addition to the referent “errors”
found in the present study. Whether
these “errors” in pronunciation more fittingly
represent the referents with which they are
paired could be determined by further matching
experiments using these new pronunciations
as the symbols to be matched with referent
choices.


CODABILITY OF COMPLEX STIMULI:
THREE MODES OF REPRESENTATION 1


FRANK KOEN 


University of Michigan 


The study's goals were: evidence for 3 representational modes, and the
application of verbal and motoric communication accuracy to facial expressions.
The stimuli were 24 photographs of the same individual. In verbal codability, Ss
communicated the expressions in writing; in enactive codability, by imitating
them. In 3 recognition conditions, Ss described the pictures aloud, imitated
the expression, or counted backward. Both verbal and enactive codability were
significantly related to verbal recognition (r = .57 and .64, respectively).
Written verbal communication was more accurate (42%) than enactive (28%).
Recognition success was related to the number of freely operating encoding
systems. The results support Bruner’s verbal and iconic representational modes;
these stimuli are communicable on both verbal and motoric levels.


1 This investigation was conducted during the
author's tenure as an NSF postdoctoral fellow
(Fellowship No. 43042) at Harvard University. The
author is deeply indebted to Roger Brown for his
assistance and advice in the design of the experiment
and for his review of the manuscript. Use of
the facilities of the Harvard Computing Center was
supported by NSF Grant GP 2723.


Codability may be considered as the capacity
of a stimulus for being precisely represented,
thereby facilitating its symbolic
manipulation across time and space. In attempting
an experimental approach to the
language-cognition relationship within a language
community, earlier studies of codability
dealt with the connection between the characteristics
of verbal encodings of a stimulus and
the performance of subjects in a recognition
task involving short-term memory for the
same stimulus. In these studies, three measures
of codability were developed: agreement
across subjects in applying verbal labels to a
stimulus (Brown & Lenneberg, 1954; Van de
Geer & Frijda, 1961), the brevity of the
verbal descriptions which are evoked by a
stimulus (Glanzer & Clark, 1963), and communication
accuracy. With respect to the last,
Lantz (1963), working with Munsell color
chips, suggested that codability be considered
the accuracy with which a verbal label applied
to a stimulus by one person can be
used by a second to identify the stimulus in
an array. She also made the important point
that the nature of the stimulus array itself
may play a major role in determining which
codability measure shows the strongest relationship
with recognition performance.
Bruner (1964) proposed the existence of 
three representational modes in human beings 
which are thought to appear in regular se- 
quence in the life of an individual and which 
operate in the encoding of external events. 
They are, in order of appearance, the enactive,
the iconic, and the verbal. In the enactive
mode, the external world is represented
in terms of the physical interactions possible 
with it—involving the individual's voluntary 
muscles. In the iconic mode, encoding is ac- 
complished by means of mental images; and 
in the verbal, overt and covert linguistic descriptions
of events serve the representational
function. Bruner speculated that each mode
depends on the previous one for its develop- 
ment and that the earlier modes are never 
completely abandoned, but probably remain 
more or less intact throughout life. To fore- 
stall semantic confusion, it is well to remem- 
ber that Bruner’s enactive representation 
would be termed iconic in the usual sense that 
the enactment is imitative of the referent. 
When subjects are presented with a chal- 
lenging recognition task involving memory for 
complex stimuli like human facial expressions, 
they may be expected to use their full re- 
sources for the accurate coding and storage of 
information. This should result in any repre- 
sentational mode which is available being 
brought into play. While the principal features
of verbal and iconic representation are
readily apparent, the same may not be true
for enactive encoding. It has been found 
(Langfeld, 1918) that subjects, faced with 
the task of identifying and naming the ex- 
pressions in photographs, report sympathetic 
imitation of the expressions as a helpful de- 
vice. It is assumed that the information 
available through each encoding system has 
both unique and redundant features. The 
present study attempts to interfere with the 
operation of one or more of the three sug- 
gested modes while facilitating the other(s). 
Performance in a recognition task should then 
change both quantitatively and qualitatively, 
depending upon which and how many encod- 
ing systems are allowed to function freely. 
Four inferences (and associated predic- 
tions) suggest themselves. First, the accuracy 
with which the stimulus items can be verbally 
communicated between subjects should be 
positively and significantly correlated with 
recognition performance by a single subject, 
so long as the operation of the verbal system 
(in the recognition task) is unimpeded. When 
interference is introduced, the correlation 
should decrease. In terms of this experiment, 
verbal codability should best predict verbal 
recognition. Second, the accuracy of motoric 
(gestural) communication should be posi- 
tively and significantly correlated with intra- 
individual recognition performance in the ab- 
sence of interference (in the recognition 
task) with the enactive representational system,
that is, enactive codability should best
predict enactive recognition. Third, assuming
that the operation of the three modes is additive,
it would follow that the greater the number
of encoding systems used by a subject,
the more information he can encode and
store, and the better his retention. There
should be an inverse relation between the
number of systems experimentally inhibited
in the recognition task and performance in
that task. In terms of this experiment, the
highest level of success should occur in the 
enactive recognition condition, where it is 
assumed that all three representational sys- 
tems are fully available to the subject. Verbal 
recognition, in which experimental interfer- 
ence is introduced into the enactive system, 
should be second best; iconic recognition, 
where both enactive and verbal systems are 
inhibited, should show the lowest performance
level, since only the iconic mode is left
to function freely. Fourth, a significant
amount of the information necessary to identify
stimuli of the kind used here can be
communicated through motoric channels, independent
of public verbalization. In terms
of this experiment, the level of enactive communication
accuracy should be significantly
greater than chance.


METHOD 
Stimuli 


The experimental stimuli were 24 of the 72 Frois-Wittmann
photographs (Hulin & Katz, 1935). Half
of the pictures in the entire collection were selected
at random and the difficulty of each in a recognition
task was determined by pretest. The 24 experimental
stimuli were chosen because they formed
a comparatively symmetrical distribution with a
wide range of difficulty levels.


Subjects 


All subjects were paid volunteer students at
Harvard University Summer School. Subjects were
randomly assigned to groups in rotation, with the
exception of the verbal codability sequence. It was
necessary that all encoders complete their task before
any decoders could be run. There were 8 subjects
in each of three recognition conditions; 8
verbal encoders; 16 verbal decoders; 12 pairs in
the enactive codability condition; and 6 subjects
were used to obtain an estimate of the perceptual
discriminability of the experimental stimuli. The
male/female ratio was 1 to 3 in all groups except
enactive codability, where there were four mixed-sex
pairs, two all-male pairs, and six all-female pairs;
and in the test of discriminability, where the ratio
was 1 to 2. No subject served in more than one
condition; all subjects were run singly except the
enactive codability pairs. Subjects were given no
feedback as to the accuracy of their performance.


Procedure 


The codability conditions were designed to establish
the accuracy with which the expressions in
the pictures could be communicated between
individuals.

Verbal codability. The 24 pictures were mounted
on separate 3 × 5 cards and were presented to the
subject one at a time. In full continuous view of both
verbal and enactive codability subjects was a three-section
chart containing all 72 pictures
in a single random arrangement (hereinafter called
“the array”). Within all groups, the experimental
items were presented in a different random order to
each subject. Communication between subjects was
unidirectional and entirely by means of written messages.
It was expected that, due to these conditions,
a verbal representational system was the only one
active. In the encoding phase, the subject first decoded
two messages written by a previous encoder,
then wrote descriptions of two pictures of his own 
choice which the experimenter decoded at once. 
Ambiguities, irrelevancies, and omissions (in these 
practice items only) were discussed. The subject 
then encoded 4 practice items (not scored) and the 
24 experimental stimuli. Each picture was exposed 
for 30 seconds, during which the subject wrote a 
description “so someone else could identify the 
picture from your description.” The procedure generated
8 × 24, or 192 messages. In the decoding
phase, each message was decoded by two different
subjects; the 8 messages written in description of a
given picture were thus decoded a total of 16 times.
Each decoder worked with 4 practice messages (not 
scored), followed by one description of each of the 
24 experimental stimuli, including 3 messages from 
each encoder. Messages were presented singly and 
search time was unrestricted. Each item received a 
weighted score, obtained in the following way. The 
total number of successful transmissions with which 
the encoder of the item was associated was added to 
twice the total number of successes with which the 
decoder of the item was associated. This was done 
to equalize the relative contributions of encoder and 
decoder to the item weighted score. Each encoder- 
decoder pair was assigned a total score of 1.00, 
which was divided by the combined success-total 
obtained above. For example, assume that Encoder 
A had 26 messages correctly decoded (across all 
decoders), and Decoder B accurately decoded 7 
messages (across all encoders). Their combined
success-total of 40 (26 + [2 × 7]) would give a
weighted score of .025 for an item which A had described
and B had successfully identified from A's
description. The weights so obtained were summed
across all encoder-decoder pairs to give a verbal 
codability score for each item. The same general 
rationale of assigning weighted scores to the items 
was followed in enactive codability and in the three 
recognition conditions. In this way, the contribution 
of a subject to an item’s score was inversely related 
to the total number of successful identifications 
with which he was associated (as encoder, decoder, 
or recognition subject). 
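
The weighting rule just described can be sketched in a few lines of Python. This is only an illustration of the arithmetic given in the text, not the procedure actually used on the Harvard Computing Center machines; the function names `pair_weight` and `item_codability` are hypothetical.

```python
from typing import Iterable, Tuple


def pair_weight(encoder_successes: int, decoder_successes: int) -> float:
    """Weight credited to an item for one successful transmission.

    The pair's total score of 1.00 is divided by the combined
    success-total: encoder successes plus twice the decoder successes.
    """
    combined = encoder_successes + 2 * decoder_successes
    if combined <= 0:
        raise ValueError("a successful pair must have a positive success-total")
    return 1.0 / combined


def item_codability(successful_pairs: Iterable[Tuple[int, int]]) -> float:
    """Sum the weights over every pair that successfully transmitted the item.

    Each tuple is (encoder_successes, decoder_successes) for one
    encoder-decoder pair that got this item right.
    """
    return sum(pair_weight(enc, dec) for enc, dec in successful_pairs)


# Worked example from the text: Encoder A had 26 successes, Decoder B had 7.
example_weight = pair_weight(26, 7)  # 1 / (26 + 2 * 7) = 1 / 40 = 0.025
```

Because the weight is the reciprocal of the combined success-total, a success contributed by a generally accurate pair adds less to an item's score than one contributed by a generally inaccurate pair, which is the inverse relation noted above.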

Enactive codability. Two subjects were seated 
facing each other across a table; they served alter- 
nately as sender and receiver of the expressions. 
The sender viewed a single picture in a mirror (to 
compensate for left-right reversals), and, without 
seeing his own face, duplicated the expression of the
picture; the receiver then identified in the array
the picture being imitated. No verbal cues were allowed,
nor could the receiver see the (sender’s)
picture. It was expected that, under these conditions,
the role of the sender involved principally the en- 
active representational system and that of the 
receiver, the iconic system. Subjects alternately sent 
and received a total of 8 practice items (not scored), 
followed by 24 experimental items. No time limits 
were imposed. Each item received a weighted score 
obtained in the following manner. Each sender-receiver
pair was assigned a score of 1.00, which
was apportioned equally among all correct transmissions.
Thus, if A as sender and B as receiver
transmitted 5 expressions correctly, the weighted 
score for each (correct) item was .20; if B as sender 
and A as receiver accurately transmitted 3 items, 
the score for each item was .33. These weights 
were summed across all sender-receiver pairs to 
obtain the enactive codability score for each item. 
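
A minimal sketch of this apportioning rule follows, assuming that each transmission is logged as a (sender, receiver, item, correct) record. That record layout and the function name `enactive_codability` are assumptions made for illustration and are not taken from the study.

```python
from collections import defaultdict


def enactive_codability(transmissions):
    """Return {item: score} from (sender, receiver, item, correct) records.

    Each directed sender-receiver pair has a total score of 1.00,
    split equally among the items that pair transmitted correctly.
    """
    correct_items = defaultdict(list)
    for sender, receiver, item, correct in transmissions:
        if correct:
            correct_items[(sender, receiver)].append(item)

    scores = defaultdict(float)
    for items in correct_items.values():
        weight = 1.0 / len(items)  # e.g. 5 correct items -> .20 each
        for item in items:
            scores[item] += weight
    return dict(scores)


# Example from the text: A -> B transmits 5 items correctly (.20 each),
# B -> A transmits 3 items correctly (.33 each).
log = [("A", "B", i, True) for i in range(1, 6)]
log += [("A", "B", 6, False)]
log += [("B", "A", i, True) for i in (1, 7, 8)]
scores = enactive_codability(log)
```

In this example, item 1 is credited by both directions of the pair, so its score is the sum of the two weights; item 6, never transmitted correctly, receives no score.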

Recognition. The recognition conditions were 
designed to establish the accuracy with which ex- 
pressions could be identified in the array by a 
subject following a brief exposure and a 30-second 
filled interval. The 24 stimulus pictures were mounted 
in pairs on 4 X 6 cards. Four complete sets of cards 
were made, each picture randomly paired with four 
other pictures, and appearing equally often as left 
and right member of the pairs. The cards were ex- 
posed singly in a tachistoscope. The array was 
placed so that it was effectively out of the field 
of vision of the subject when he was facing the 
tachistoscope. Within all groups, the experimental 
items were presented in a different random order to 
each subject. 

The enactive recognition condition was planned 
so that all three encoding systems were maximally 
available to the subject. A pair of pictures was 
exposed in a tachistoscope for 2.5 seconds; then 
followed a 30-second interval during which the
subject continued to look straight ahead and tried 
“to reproduce both expressions as exactly as you 
can several times, and also try to remember the 
pictures in any other way you choose.” After 30 
seconds, the subject turned to the array and identi- 
fied the pictures seen in the tachistoscope. The first 
4 pairs were practice items (not scored); they were
followed by 12 experimental pairs. Verbal recogni- 
tion was designed to interfere with the enactive 
system, while allowing full use of the verbal and 
iconic modes. All conditions were the same as in 
enactive recognition except that during the 30- 
second interval, the subject described aloud the two 
pictures he had seen; the experimenter recorded the 
descriptions verbatim. The idea was that the subject 
would find it difficult to talk and make the expres- 
sions at the same time. In iconic recognition, experi- 
mental interference was introduced into both enactive 
and verbal systems, leaving the iconic as the only 
untrammeled mode available to the subject. Condi- 
tions were again the same, except that during the 
30-second interval, the subject counted aloud back- 
ward by either 3s or 7s from random starting points 
designated by the experimenter. It was thought that
an unaccustomed task requiring the subject's attention
would interfere with covert verbalization, and
the counting aloud would block the making of the 
expressions. In each recognition condition, a weighted 
score for each item was obtained in a manner parallel 
to that followed in codability. 

TABLE 1

PRODUCT-MOMENT CORRELATIONS BETWEEN
EXPERIMENTAL CONDITIONS

Condition                    2       3       4       5       6
1. Verbal codability        .46*    .57**   .45*    .28    -.41*
2. Enactive codability              .64**   .34     .15    -.46*
3. Verbal recognition                       .45*    .51*   -.54**
4. Enactive recognition                             .35    -.40
5. Iconic recognition                                      -.54**
6. Discriminability


Discriminability. The following procedure was used
to estimate the perceptual difficulty of discriminating
each experimental item from the others in the
array. The subject was seated directly in front of the
middle section of the array and was given a duplicate 
print of one of the pictures with instructions to 
“match the one in your hand with its exact duplicate 
in the shortest possible time." If the subject made 
an incorrect choice, the experimenter said “No,” 
and the subject continued the search until he made 
a successful match. The discrimination score of an 
item was the median number of seconds required 
by the six subjects to find the item—hence the higher 
the score the lower the discriminability of the item. 
Since time was of the essence in this condition, it 
was necessary to balance the positions of the items, 
in order to compensate for the assumed normal 
left-right and top-bottom scanning patterns. The 
array consisted of three poster boards, each con- 
taining four rows of six pictures. Each board was 
presented in left, middle, and right positions one- 
third of the time, and the vertical order of the rows 
on each board was inverted from top to bottom for 
one-half the subjects. The subject identified 4 
practice items (not scored), followed by the 24 
experimental stimuli, interspersed with 13 pictures 
which had been the most frequent incorrect choices 
in the enactive recognition condition. The items were
presented to each subject in a different random order. 


RESULTS 
Verbal Codability-Verbal Recognition Relation 


It was predicted that verbal codability data 
would be significantly correlated with verbal 
recognition performance, and show lower rela- 
tionships with both enactive and iconic 
recognition. This prediction was supported 
(the correlations being .57, .45, and .28,
respectively) as can be seen in Table 1. The
partial correlations between verbal codability
and verbal recognition, with enactive codability
and discriminability (separately) held
constant, were .41 (p < .06) and .46 (p <
.05), respectively. When both were partialed
out, the correlation fell to .35. 
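
The text does not state how the partial correlations were computed. As a hedged illustration, the standard first-order partial correlation formula can be applied to the rounded entries of Table 1; the function name `partial_r` is hypothetical, and the choice of formula is an assumption.

```python
import math


def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# x = verbal codability, y = verbal recognition (r = .57 in Table 1).
# Holding enactive codability constant (r = .46 with x, .64 with y):
r_given_enactive = partial_r(0.57, 0.46, 0.64)

# Holding discriminability constant (r = -.41 with x, -.54 with y):
r_given_discrim = partial_r(0.57, -0.41, -0.54)
```

Computed from the two-decimal table entries, these come out at roughly .40 and .45, close to the reported .41 and .46; the small gap is presumably due to rounding in the tabled correlations.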


Enactive Codability-Enactive Recognition
Relation


It was predicted that enactive codability 
data would be significantly correlated with 
enactive recognition performance, and show 
lower relations with verbal and iconic recognition.
The respective correlations of .34, .64
(p < .01), and .15 indicate that the prediction
was not confirmed. Furthermore, with
verbal codability partialed out, the enactive
codability-enactive recognition correlation
dropped to .16. There was a small effect
uniquely associated with the operations used
to define enactive codability. On the other
hand, the partial correlations between enactive
codability and verbal recognition were
.52 and .53 (both significant at the .02
level) with verbal codability and discriminability,
respectively, held constant. With both
partialed out, the enactive codability-verbal
recognition correlation was .44 (p < .05).


Comparative Performances in the Recognition 
Conditions 


The three sections of the array stood in the
same relative positions for all recognition subjects.
Though search time in the task was
unrestricted, it was possible that successful
identification of an item was related to its
position in the array. Therefore, after an
arc-sine transformation of the percentage of
correct identifications of each item, a repeated
measures analysis of variance was run, with
the three section positions and the three
recognition conditions as main effects. The
only significant effect was recognition conditions
(F = 4.899, df = 2/21, p < .05,
one-tailed). It was expected that the enactive
condition would show the highest success level,
followed by the verbal and iconic conditions.
The results were in the predicted order, with
57% correct identifications in enactive recognition,
52% in verbal, and 35% in iconic.
Significant gap tests show that the enactive
and verbal conditions tended to be better
than iconic at the .06 and .07 points,
respectively. The prediction is considered to be
partially confirmed by these results. Again, it
should be noted that the greatest drop in
performance level is associated with inhibition
of the verbal system.


Enactive Communication Accuracy


In enactive codability, 28% of the transmissions
of the expressions of the pictures resulted
in correct identification of the item. This
was far above chance level, of course, since in
each case the receiver had to choose 1 picture
from the entire 72 picture array. This supports
the prediction that communication accuracy
can be demonstrated on the motoric
as well as the verbal level. Of course, verbal
codability was much more accurate, resulting
in 42% correct identifications, a difference
which is significant at the .002 level (Mann-
Whitney test).
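
The chance baseline implied by a 72-picture array can be made explicit with a short calculation. The sketch below assumes that a random guess is equally likely to land on any of the 72 pictures; the names `ARRAY_SIZE`, `chance_level`, and `times_chance` are introduced here for illustration only.

```python
ARRAY_SIZE = 72  # pictures in the array, as stated in the text

# Probability of picking the right picture by guessing at random.
chance_level = 1 / ARRAY_SIZE  # about 1.4%


def times_chance(observed_rate: float) -> float:
    """How many times larger an observed hit rate is than the chance rate."""
    return observed_rate / chance_level


enactive_ratio = times_chance(0.28)  # enactive codability: 28% correct
verbal_ratio = times_chance(0.42)    # verbal codability: 42% correct
```

Under that assumption the enactive hit rate is about 20 times the chance rate and the verbal hit rate about 30 times.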


Other Differences 


Female subjects performed significantly 
better than males in enactive and verbal 
recognition (Mann-Whitney test, U = 5.5, n1
= 4, n2 = 12, p < .05), while males tended
to be superior in iconic recognition (Mann-Whitney,
U = 0, n1 = 2, n2 = 6, p < .07).
The change in relative positions was due al- 
most entirely to shifts in female scores. No 
significant sex differences appeared in either 
the discriminability or codability tasks. 


DISCUSSION
Evidence for Three Modes 


The question of whether we now have 
experimental evidence for three representa- 
tional modes is only partially answered by 
the results of this study. It is apparent that 
the operations performed did not succeed in 
isolating any one mode in either the codabil- 
ity or recognition conditions, as evidenced 
by the correlations between the results. For 
instance, in verbal codability it was expected 
that the only representational system active 
was the verbal one. Between subjects, that 
was true. But many encoders employed an 
enactive process by imitating the expressions 
in the pictures while writing the messages; 
many decoders later tried to duplicate with 
their own faces the expression described in 
the message and then “matched my expres- 
sion with one of the pictures in the array,” 
as one reported in a postexperimental inter- 
view. The significant reduction of the verbal 
codability-verbal recognition correlation by 
partialing out enactive codability becomes
more understandable in the light of these 
circumstances. Of course, there are strong 
indications of a pervasive influence of the 
verbal system; a desire for parsimony tempts 
us to ascribe the differences in performance 
levels in the three recognition conditions to 
the concurrent amount of inhibition of verbal 
representation. But it must be remembered 
that partialing out enactive codability re- 
duces the strong relation between verbal 
codability and verbal recognition to a mar- 
ginally significant level, and the difficulty of 
individual items in verbal codability more 
accurately predicted ease of recognition in 
the verbal than in the enactive condition 
where there was no interference with covert 
verbalization. The question is how much of
the systematic variance not accounted for by 
verbal encoding is associated with other 
representational modes. 

The independent effects of iconic function- 
ing are suggested by: an iconic recognition- 
discriminability partial correlation of —.49 (p 
< .05) with verbal codability held constant,
and the very low correlations of both verbal 
and enactive codability with iconic recogni- 
tion. The iconic mode is apparently closely 
related to perceptual discriminability, as might 
be expected from its hypothesized nature, 
with verbal codability playing a secondary 
role. 

It is difficult to make a case for the en- 
active mode on the basis of these results. 
Enactive codability was a better predictor of 
verbal recognition than of enactive recogni- 
tion, and its general contribution to the ob- 
served effects appeared minor. For example, 
in enactive codability, verbal communication 
between subjects was forbidden, and it was 
expected that the sender would employ only 
an enactive representational mode. However,
nearly all senders reported covert verbaliza- 
tion during the task. More importantly, re- 
ceivers seldom overtly imitated the senders’ 
expressions with their own faces, as enactive 
recognition subjects were required to do. 
Hence, receivers apparently encoded and 
stored the expressions verbally and/or iconi- 
cally. This methodological flaw probably con- 
tributed to the similarity (from a receiver’s 
viewpoint) between the enactive codability 
and verbal recognition conditions which was
disclosed by the unexpected .64 correlation. 
In the short intervals (typically ranging from 
5 to 20 seconds) between observing the 
sender's expression and perceiving a picture 
which appeared similar, the receiver was free 
to use either verbal or iconic system or both. 
That either would serve the purpose might 
be expected from the correlations of verbal 
recognition with verbal codability and dis- 
criminability (.57 and —.54, respectively). 
This ready availability of alternate encoding 
systems helps explain the fact that partialing 
verbal codability out of the enactive codability-verbal
recognition correlation only reduced
the figure from .64 to .52.
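
The partialing step discussed above can be illustrated with the standard first-order partial correlation formula. The sketch below is only an illustration, not the authors' computation: the .64 and .57 values come from the text, while the correlation between enactive and verbal codability is not reported in this section, so the value used for it is a hypothetical placeholder.

```python
import math


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("undefined when x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denom


# Reported in the text above.
R_ENACTIVE_COD_VERBAL_REC = 0.64  # enactive codability vs. verbal recognition
R_VERBAL_COD_VERBAL_REC = 0.57    # verbal codability vs. verbal recognition

# HYPOTHETICAL placeholder: the enactive codability vs. verbal codability
# correlation is not given in this section.
R_ENACTIVE_COD_VERBAL_COD = 0.45

adjusted = partial_corr(
    R_ENACTIVE_COD_VERBAL_REC,
    R_ENACTIVE_COD_VERBAL_COD,
    R_VERBAL_COD_VERBAL_REC,
)
print(f"partial r with verbal codability held constant: {adjusted:.2f}")
```

With a modest overlap between the two codability measures, the partial correlation stays close to the zero-order value, which is the pattern the paragraph describes.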

It is probable that stimuli more usually 
represented in terms of muscular activity 
(e.g., tying one's shoelaces) may be more 
appropriate for demonstrating enactive en- 
coding. It should be pointed out that more 
than half the subjects reported (in postexperi- 
mental interviews) a tendency to reproduce 
the pictures’ expressions with their own faces, 
but only occasional subtle and fleeting overt 
movements were detected by the experi- 
menter. Possibly, in motoric encoding, only 
incipient, predispositional muscular involve- 
ment is necessary to serve the enactive 
representational function in adults. 


Verbal Codability of Complex Stimuli 


This study successfully extended the con- 
cept of verbal communication accuracy to 
complex, meaningful stimuli. We may con- 
clude that stimuli which can be readily 
identified in an array from freely structured 
verbal descriptions by other individuals are 
also the stimuli most likely to be correctly 
recognized by a single subject after a brief 
exposure and an interval in which he is free 
to verbalize.


Concluding Observations 


Voluntary comments by some of the verbal 
recognition subjects invite the speculation 
that at least some human beings have covert 
verbal encoding systems with qualities dif- 
ferent from those of their public utterances. 
This is indicated by the complaint of some 
subjects that the necessity to describe the 
pictures aloud interfered with their internal
verbalizations about them; others correctly
identified pictures which they admitted did
not agree with their own overt descriptions,
but of which they were quite certain. Could it
be that one of the characteristics of these
internal verbalizations is something akin to
the telegraphic style of the early speech of
children (Brown & Bellugi, 1964), which
drops functors such as prepositions, conjunctions,
and articles? If so, we would expect
recall for those aspects of the environment
which are encoded by such words to be
relatively poorer than recall for features which
are symbolized by content words. In that case,
individual differences in associative hier- 
archies may provide a valuable entry to the 
study of the striking differences in encoding 
and decoding abilities exhibited by the sub- 
jects in this study. 

The picture of codability that emerges 
from the present study and those which preceded
it is an extremely complex one. In
addition to at least two, and probably three, 
representational modes, experimental results 
are very sensitive to the nature of the stimu- 
lus array, and to the experimental procedures 
employed. Three interesting questions suggest
themselves. The first involves the relative
durability and resistance to distortion of
information stored in each of the systems.
For instance, after brief exposures of stimulus
material, iconic encoding could be expected
to fade more quickly than verbal. In
that case, extending the interval between
exposure and recognition should result in
increasing the superiority of verbal over iconic
recognition conditions. Second, increasing
ability to make subtle perceptual discriminations
in highly specialized areas of human
activity should be related to the increase
in size and greater organization of the relevant
technical vocabularies. Third, stimuli
which are particularly adapted to "muscle
storage" should increase the superiority of
enactive over verbal recognition conditions.

At least four facets of codability can be
discerned at this point: communication accuracy,
brevity of message, distinctiveness of
message, and naming agreement. Each of
them may well be involved in almost all
cognitive activities. For instance, problem
situations that evoke highly similar and relatively
short, unambiguous verbal descriptions
from multiple subjects should prove easier to
solve (by any one subject) than their opposites.
The present study supports Lantz’
(1963) results, showing that communication
accuracy is a useful predictor of recognition
performance. Message brevity could be expected
to be an important variable in a situation
requiring choice behaviors under time
pressure. When several messages must be
stored concurrently (as in problems involving
recognition of recurrent events) message distinctiveness
may play an independent role;
this aspect of codability has not yet been
investigated. Finally, it is probable that
agreement between individuals in applying 
a given verbal label to a stimulus is related 
to the speed with which a single subject can 
respond to the stimulus. It seems unlikely 
that these possibilities exhaust the range of 
interactions involving codability. 
EFFECTS OF THREE SOCIAL RESPONSES ON
VASCULAR PROCESSES 1


JACK E. HOKANSON AND ROBERT EDELMAN
Florida State University 


2 separate studies were conducted to investigate the effects of various social 
responses on vascular processes. It was found that: (a) receipt of a noxious 
stimulus caused by a “fellow S” produced systolic elevations of 6-10 mm.; (b) 
an aggressive counterresponse was followed by a relatively rapid return of the 
vascular measures to the prefrustration base level; (c) friendly or ignoring 
counterresponses were followed by a relatively slow return to base line com- 
parable to that of control Ss who were given no opportunity to respond; (d) 
the above results were obtained with the systolic blood pressure and a vaso- 
motor response but not with diastolic blood pressure or heart rate. These 
results were obtained only with male Ss, female Ss showing no differential
recovery rates.


In a series of earlier reports (Hokanson & 
Burgess, 1962a, 1962b; Hokanson, Burgess, 
& Cohen, 1963; Hokanson & Shetler, 1961) 
it was found that direct verbal or physical ag- 
gression towards an equal-status frustrater 
was associated with a rapid return of elevated 
systolic blood pressure to prefrustration resting
level. Control subjects who were also
frustrated but given no opportunity to ag- 
gress maintained systolic elevations of ap- 
proximately 12 millimeters which showed 
gradual recovery over an 8-minute period. 
While these results appear to be fairly re- 
liable, there are several important gaps in 
the data gathered thus far, and a number of
methodological deficiencies which the present 
paper attempts to rectify. Two studies will 
be reported herein, the second being an ex- 
panded replication of the first. 

In the earlier studies, frustrated subjects 
were placed in either of two conditions: opportunity
to counteraggress or a no-aggression
control group. No information was obtained
regarding the course of systolic pressure
had the subject been given an opportunity
to make a friendly or an ignoring counter- 
response, both of which are certainly possible 
behavioral alternatives in social situations of 
this kind. Thus, one purpose of the present 
studies is to observe the postfrustration vas- 
cular changes occurring when subjects choose 


1 This research was supported by the Research
Council, Florida State University. 
to make either an aggressive, friendly, or ignoring
response towards the instigator.

Second, in the initial studies of this series,
sex differences in systolic response to an interpersonal
frustration and to subsequent opportunities
to aggress were not systematically
investigated. Although both male and female
subjects were utilized, the experimenter
(frustrater) was male, thus confounding the
sex variable. It was felt therefore that an
initial investigation in this area should have
the frustrater and subject of the same sex.

A third aim of the present study is to obtain
more detailed vascular measurements,
particularly with respect to the course of recovery
after a frustration-aggression (or other
response) sequence. Studies heretofore have
simply measured systolic pressure pre- and
postfrustration and once immediately after
the opportunity to aggress, leaving unspecified
the subsequent systolic events. The present
investigation hopes to demonstrate longer
range vascular changes which are consistent
with the earlier results.

Finally, there are several procedural faults
in the earlier studies which tend to limit the
generality of the findings and make them
difficult to replicate: the frustration manipulation
involved face to face verbal harassment
by the experimenter while the subject was
ostensibly working on an intellectual task, a
procedure difficult to standardize and susceptible
to variation across subjects and experimenters;
the types of aggression permitted
the subject (delivering electric shocks to the
frustrater or filling out a rating sheet evaluating
the experimenter) required a relatively
lengthy and complicated instruction period
(about 30 seconds) which could have been
frustrating itself, thereby producing uncontrolled
changes in systolic level. Last, in all
prior studies, the experimenter (frustrater)
himself measured blood pressure during the 
experimental procedures, hence contaminating 
the frustrater with the experimenter role; 
and, since the experimenter was male, pos- 
sibly producing differential systolic responses 
between male and female subjects. The pro- 
cedures in the present study attempt to over- 
come these methodological problems. 


STUDY I
Procedure 


The subjects in this experiment consisted of 12 
male and 16 female undergraduate volunteers between
the ages of 18 and 24. All subjects indicated
that they had no history of heart trouble and were 
also willing to participate in an experiment involving 
electric shocks. Only 3 female subjects had to be 
initially excused because of unwillingness to undergo 
shock. The experimental groups consisted of 7 males 
and 11 females, while the control groups contained 
5 subjects of each sex. Each subject was seen in- 
dividually for one session lasting approximately 60 
minutes. 

Subjects were seated in an individual booth sepa- 
rated by several feet from a “fellow subject” (an 
experimental accomplice of the same sex as the sub- 
ject) in a similar booth. On a panel at the front of 
each booth were mounted three response keys 
labeled “Shock,” “Reward,” and “No Response.” A 
red light and a white light were also placed on the upper
part of each panel. 

Each subject had finger electrodes attached to the 
first and second digits of the left hand. Electric 
shocks could be delivered through these electrodes 
via a Grass Model S4 stimulator. In a preliminary 
testing procedure the “pain” threshold of each subject 
was obtained on three occasions, with the final volt- 
age reading being the intensity of shock delivered 
during the experimental procedures. 

The following instructions were given to each subject
at the start of the session:


This is an experiment dealing with interpersonal 
behavior. In your booth there are three buttons 
labeled Shock, Reward and No Response. If you 
push the Shock button, a painful electric shock is
given to the other subject. If you push the Reward 
button, the other subject receives a point; and this 
is indicated by the red light in his booth flashing 
on. If you push the No Response button, nothing 
will happen to the other subject, and this simply 
indicates that you do not care to respond on
this trial. 

We have tried to set up this procedure so that 
it is comparable to real-life situations; that is, if 
you do not like someone, or what they are doing, 
you might try to hurt them. This would be 
comparable to a shock response here. If you like 
someone, you might try to do something nice for 
them, and this is similar to a reward response 
here. Finally, there are times when you just 
don't feel like interacting and this corresponds to 
a no-response choice in this experiment. 

Here is the way the procedure will work: when 
the white light in your booth comes on you are 
to decide which of the three responses you are 
going to make. Do not, however, respond until 
the white light comes on a second time. Are there 
any questions? 


At this point the experimenter answered any ques- 
tions by rereading the relevant portions of the 
instructions. The experimenter also indicated that 
the subject would have his blood pressure recorded 
throughout the first portion of the session, thereby 
implying that the other "subject" would be simi- 
larly treated in the latter part of the session. At 
this point an experimental assistant attached the 
blood pressure cuff to the subject’s right arm and 
proceeded to measure systolic level with a sphygmomanometer
at 20-second intervals throughout a
10-minute rest period. 

The experimenter regulated the onset of the white 
lights (ready and response signal) in each booth, 
thus controlling the pace of the various events 
during the session. For the experimental subjects the 
following sequence was carried out on each “trial”: 
ready signal to the confederate, signal to the con- 
federate to respond, ready signal to the subject, 
signal to the subject to respond. The interval be- 
tween each of these events varied randomly between 
10 and 20 seconds from trial to trial. The apparatus 
was wired so that no matter which key the confed- 
erate pushed on a trial, the subject always received 
a shock. Since no verbal communication was allowed 
during the procedure, the subject was in effect 
facing a consistently punitive, unfriendly opponent. 
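
The trial sequence described above can be sketched as a simple schedule generator. This is a minimal illustration under stated assumptions, not the authors' apparatus logic: the text gives only the event order, the 10 to 20 second range, and the fact that every confederate key press shocked the subject. The uniform distribution, the delay before the first event, and all function names are hypothetical.

```python
import random

# Event order within one trial, as described in the text.
EVENTS = (
    "ready signal to confederate",
    "response signal to confederate",
    "ready signal to subject",
    "response signal to subject",
)

# Response keys mounted on each booth panel.
KEYS = ("Shock", "Reward", "No Response")


def subject_receives_shock(confederate_key: str) -> bool:
    """Wiring described in the text: any confederate key shocks the subject."""
    if confederate_key not in KEYS:
        raise ValueError(f"unknown key: {confederate_key}")
    return True


def build_trial(rng: random.Random, min_gap: float = 10.0, max_gap: float = 20.0):
    """Return (event, onset_seconds, shock_follows) tuples for one trial.

    ASSUMPTION: gaps are drawn uniformly, and a gap also precedes the
    first event; the source only says the intervals varied randomly.
    """
    schedule = []
    clock = 0.0
    for event in EVENTS:
        clock += rng.uniform(min_gap, max_gap)
        shock_follows = event == "response signal to confederate"
        schedule.append((event, clock, shock_follows))
    return schedule


def build_session(n_trials: int = 5, seed: int = 0):
    """Build the five experimental trials used in Study I."""
    rng = random.Random(seed)
    return [build_trial(rng) for _ in range(n_trials)]


for event, onset, shock_follows in build_session()[0]:
    note = " (subject shocked)" if shock_follows else ""
    print(f"{onset:6.1f} s  {event}{note}")
```

The sketch omits the variable intertrial interval, which in the study depended on how long each subject's systolic pressure took to return to resting level.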

The subject’s systolic blood pressure was recorded 
immediately after each of these events. After the 
subject made his response, thus ending a trial, his 
systolic pressure was recorded at 20-second intervals 
until it reached the original resting level. Thus, the 
intertrial interval varied across trials and subjects 
depending on systolic recovery rate. The experimental 
portion of the session involved five such trials. 

This procedure involved then a situation in which 
the subject underwent five painful stimuli presum- 
ably caused by a fellow subject. In response to each 
of these shocks, the subject was to choose one of 
the three counterresponses available to him: shock, 
reward, or no response. A tally of which of these 
responses was made on each trial was also recorded.

The control subjects were given the same instructions,
with the additional comment that one or
the other of the "subjects" might get several chances
to respond in a row before the other subject would
get a turn. Following these instructions the trials
began with the confederate getting five consecutive
opportunities to respond (deliver shocks) while the
subject never got a ready signal nor a signal to
respond. Again, systolic pressure was recorded at
20-second intervals after the receipt of shock until
it reached the original resting level. Thus, the control
subjects underwent the same five painful stimuli
delivered by a “fellow subject” but did not have an
opportunity to make a counterresponse of any kind.

Fig. 1. Percentage of systolic blood pressure elevation
during trial and recovery period as a function of
type of response made (males, Study I).

At the completion of the session, each subject was 
informed of the deception involved, given a lengthy 
explanation of the purpose of the research, and 
sworn to secrecy. 


Results 


The systolic blood pressure data for males 
and females were treated separately and 
arranged in the following manner: prior to 
the onset of any particular trial the systolic 
basal level was noted. Elevations over this 
base level were then calculated for each event 
during the trial and during the posttrial re- 
covery period. Average systolic elevations 
were obtained for each trial event, keeping 
the data separate depending on the response 
the subject made on each trial; that is, shock, 
reward, no response, or control. Thus for each 
sex group four sets of data were obtained, 
each portraying the average course of sys- 
tolic pressure during trials on which a par- 
ticular counterresponse to the electric shock 
was made.

Figure 1 presents the mean percentage of 
systolic elevations for male subjects over the 
trial events and the recovery period as a 
function of the response made by the subject 
on that trial. The systolic data were converted
to percentages of the maximum elevation
achieved after the receipt of shock in
order to equate the four sets of data for
minor differences in systolic response to shock.
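
The percentage conversion described above can be sketched as follows. This is a minimal illustration, not the authors' computation: the event labels follow those used on Figure 1, "R1" is a hypothetical label for the first recovery reading, and the millimeter values are invented for the example rather than taken from the study.

```python
def percent_of_peak(elevations_mm: dict, peak_event: str = "S/SH") -> dict:
    """Express each mean elevation as a percentage of the post-shock peak.

    elevations_mm maps an event label to the mean systolic elevation
    (in mm. over the pretrial base level) for one response category.
    """
    peak = elevations_mm[peak_event]
    if peak <= 0:
        raise ValueError("peak elevation must be positive")
    return {event: 100.0 * value / peak for event, value in elevations_mm.items()}


# ILLUSTRATIVE values only; not data from the study.
shock_trials_mm = {
    "RS/C": 6.0,    # ready signal to confederate
    "S/SH": 10.0,   # subject receives shock (maximum elevation)
    "RS/S": 8.0,    # ready signal to subject
    "S/RSP": 4.0,   # subject has responded
    "R1": 2.0,      # hypothetical label: first 20-second recovery reading
}

for event, pct in percent_of_peak(shock_trials_mm).items():
    print(f"{event:6s} {pct:5.1f}%")
```

Scaling each curve to its own post-shock peak puts the four response categories on a common 0 to 100 scale, so that recovery rates can be compared even when the raw response to shock differs slightly between them.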

As can be seen at the first data point a
substantial amount of the initial increase in
pressure occurred when the confederate received
the ready signal (labeled RS/C on
graph). An audible click occurred in the apparatus
when this ready signal was given, to
which the subject responded with a systolic
increase.

The second data point indicates the sys- 
tolic elevation immediately after the subject 
received the shock (labeled S/SH). It is 
apparent that at this point during each trial, 
systolic pressure is at its maximum. The mag- 
nitude of this response was approximately 10 
millimeters in each set of data. The small 
differences which did occur were not statistically
significant.

The mean percentage of elevation in pres- 
sure immediately after the subject received 
the ready signal to respond (but before he 
actually responded) is presented at the third 
data point in Figure 1 (RS/S). Finally, the 
percentage of systolic elevation immediately 
after the subject responded is portrayed in 
Column 4 (S/RSP) along with the eleva- 
tions in each subsequent 20-second recovery 
period in the remaining columns. 

Analyses of variance and Duncan range
tests were performed on the four sets of elevations
at each point during the trial and
recovery period. Prior to this the percentage
data were inspected for normality and independence
of means and variances and it was
deemed appropriate to use the data in this
form for the statistical analyses. The vertical
arrows on Figure 1 indicate those means
which are significantly different from one
another beyond the .05 level. Most noteworthy
here is the dramatic drop in systolic
level following an aggressive (shock) counterresponse
by the subjects. At no point in this
data are the curves for the reward and ignoring
(no response) counterresponses significantly
below that of the control subjects.

A markedly different set of curves is obtained
for the female subjects under these
same experimental conditions (Figure 2).
Here it can be seen that the recovery curve
following a shock response does not differ
from the “reward” or “no-response” curves.
Indeed, the rate of recovery for all three
response curves is approximately that of the
male control group. Of added interest is the
uniformly higher control group curve for the
females, suggesting that systolic elevation
tends to be prolonged in a frustrating situation
in which no type of social counterresponse
is permitted. Apparently the well-worn
masculine observation that women find it
uncomfortable to just “keep still” has some
validity.

Statistical comparisons of the data points 
on the recovery curves following a shock 
counterresponse were made between the males 
and females in order to verify the apparent
differential return to base line between the
sexes. The males were found to have signifi- 
cantly lower (.05 level) pressures immedi- 
ately after making a shock counterresponse 
(S/RSP) and also at Recovery Trials 2 and 
3. Thereafter in the recovery period the two 
sets of data do not significantly differ. 

With respect to the behavioral data, there
was approximately an equal distribution of
the three social responses obtained. There 
was also no appreciable difference in the use 
of the available responses between the male 
and female subjects. Lastly, the three re- 
sponses were equally distributed over trials 
for each sex group, so that the results cannot 
be attributed to one particular response pre- 
dominating in the early or late trials. 


STUDY II


Procedure


The procedures utilized in this investigation were 
the same as Study I, with the following exceptions: 
only male subjects were used (N = 12); continuous
recordings of heart rate and digital volume (plethysmograph)
were obtained throughout the procedure
via a Grass Model 5 polygraph; diastolic blood
pressure was recorded with a sphygmomanometer
along with systolic pressure; eight instead of five 
trials were used; on two trials the subjects were not 
given any opportunity to respond, thus having each 
subject serve as his own control with respect to 
vascular recovery rates. 


Results 


Fig. 2. Percentage of systolic blood pressure elevation
during trial and recovery period as a function of
type of response made (females, Study I).

The systolic blood pressure data for these
12 male subjects is presented in Figure 3.
Here, the actual elevations of pressure are
given in millimeters of mercury. In all other
respects the graph is similar to Figure 1. As
can be seen, the separation of the “shock response”
curve from the rest is essentially the
same as in the previous data. Here again, the
arrows indicate differences significant beyond
the .05 level.

Figure 4 presents the data obtained from
the plethysmographic records of each subject.
Again, the data is portrayed in terms of
change in digital volume (in cubic millimeters)
from pretrial resting level (zero point
on the ordinate). It can be noted that the
ready signal to the confederate and the receipt
of shock by the subject are accompanied
by a peripheral vasoconstriction, with the
subsequent recovery indicating a gradual
vasodilation. As with the systolic data, the
recovery rate following an aggressive counter- 
response is faster than the others. 
Fig. 3. Systolic blood pressure elevation during trial
and recovery period as a function of type of response
made (Study II).

Fig. 4. Digital vasoconstriction in cubic millimeters
over pretrial base level during trial and recovery
period as a function of response made (Study II).


Graphs of the heart rate and diastolic data 
are not presented due to the absence of any 
systematic change in either measure. Minimal 
movement away from resting level at the 
receipt of shock was obtained with each, thus 
precluding the observation of any differential 
recovery rates following the subjects’ counter- 
responses. Evidently more noxious stimulation 
is required to produce consistent elevations
in these two measures. 

As with the data in Study I the counter- 
responses to shock made by the subjects were 
approximately equally divided among the 
three responses available. In addition, there 
was no trial by response effect noted, with 
the shock responses occurring with approxi- 
mately equal frequencies on each trial. 


Discussion 


The current data lend support to previous 
findings which indicated that an aggressive 
counterresponse, as opposed to other types of 
responses, to an interpersonal provocation is 
accompanied by a relatively rapid return of 
systolic blood pressure to prefrustration lev- 
els. In addition, the same effect appears to be 
manifested in the more rapid vasodilation 
observed in Study II. Emphasis should be 
placed, however, on the fact that these results 
were obtained only with male subjects. 

In reviewing all of the studies carried out 
in this series it becomes apparent that there 
are a variety of circumstances under which 
this “catharsis” phenomenon does not occur: 
with a high status frustrater (Hokanson & 
Burgess, 1962a); with displaced aggression 
towards a person unrelated to the frustrater
(Hokanson et al., 1963); with fantasy aggression
(Hokanson & Burgess, 1962b); and in
the current study, among college-age female
subjects. An attempt to place these results
within a general learning theory framework
may prove instructive.

In each instance where the phenomenon
did not occur it can be assumed that the
subject had previously learned that under
these particular social conditions an aggressive
counterresponse to a provocation was not
an appropriate behavior. That is, aggressive
behavior will not bring the interpersonal exchange
to a rewarding conclusion and thus the
elevated vascular processes are maintained.
Similarly, this view maintains that the subject
has learned that under certain other conditions,
an aggressive response is instrumental
in terminating noxious social stimulation,
thereby also being associated with a relatively
rapid reduction of autonomic processes.
Hence this series of studies has in effect been
identifying the complex discriminative stimuli
under which aggressive behaviors have or have
not been reinforced in our culture.

Indirect evidence in support of this view is 
provided in an unpublished thesis by Helen 
E. Mershon at Florida State University. She 
noted in previous studies that neither fantasy 
aggression towards a low status frustrater 
nor direct aggression towards a high status 
frustrater produced the “catharsis” effect.
Reasoning that in neither case was there
much likelihood of these conditions having
been reinforced in the past, she reversed the
type of aggression and status of object conditions
in part of her experiment. Using systolic
blood pressure as the dependent variable she
now found that fantasy aggression towards a
high status frustrater was associated with the
rapid drop in systolic level, as was the direct
aggression towards the low status frustrater.
Presumably, direct aggression towards an
object with retaliatory capabilities had not
proved to be reinforcing in the past, whereas
covert aggression had. Similarly, fantasy aggression
towards a low status frustrater had
not been as successful as direct verbal or
physical aggression.

Within the framework of this discussion, 
the failure to find the catharsis effect with 
female subjects in the present study is more
comprehensible. It is virtually a truism that
middle-class females in our culture receive
little reward or training with respect to physical
aggression; that is, it has not become instrumental
in terminating noxious social stimulation.
Under these circumstances it is not
surprising that female subjects, even when 
forced by the limited number of responses 
available in the present study to utilize an 
aggressive counterresponse, manifest no as- 
sociated physiological relief. 

It is recognized that the viewpoint out- 
lined here loses considerable force when the 
female behavioral data in the present paper 
is considered. It will be recalled that the fe- 
males made as many shock counterresponses 
as the male subjects. This observation is at 
variance with the more usual experimental 
finding that males manifest more physical 
aggression than females (e.g., Bandura, Ross, 
& Ross, 1961). It is possible, however, that 
the particular design of the interpersonal sit- 
uation in the present study is a determining 
factor. Here the subject is faced with a con- 
sistently punitive opponent and only three 
possible response keys. It is likely then that 
subjects would utilize all three keys in an at- 
tempt to terminate the shock. 

A more direct test of this approach is currently
being carried out. Utilizing female
subjects, a long, preliminary training period
is instituted in which shock counterresponses
are made instrumental in terminating a “fellow
subject’s” aggression. After this aggression
training, the subject will again be placed
in a two-person interaction similar to that 
described in this paper, with the expectation 
that the frequency of aggressive responses 
will increase, and more important, that rapid 
recovery of vascular measures following an 
aggressive response will be obtained.


S AMONG SUBJECTS AND
AND SUBJECTS'

Washington


4 individual difference variables were studied in a situation requiring S to
engage in self-description. The variables were: (a) sex of S, (b) sex of E,
(c) hostility score of S, and (d) hostility score of E. After S had presented a
description of himself, E and S rated each other's behavior. Observers rated
the behavior of E during the experimental session. Numerous significant
results related to the 4 individual difference variables were obtained. Perhaps
the most provocative finding was that Es rated by Ss as high in congeniality
elicited less personally significant material from Ss than did Es with low ratings.


Recent research on the nature of psychological
experiments and the experimenters who carry them
out has led to at least one salutary effect. It has
made clear that the simple model of the psychological
experiment as a vehicle for studying the responses
of subjects to discrete intended experimental
manipulations is outmoded (Orne, 1962; Rosenthal,
1963; Sarason, 1965; Sarason & Minard, 1963;
Winkel & Sarason, 1964).

Like the chemist who controls or manipulates air
pressure because of its possible confounding effects
on reactions central to his interests, so also, the
general psychologist must attend to the social and
attitudinal variables involved in persons’
participation in experiments. For the social
psychological and personality researcher, of course,
attention to individual difference and interpersonal
variables does not come about because they are
extraneous, confounding factors which require
isolation. Rather, they are the very subject matter
of the fields of personality and social psychology.

The present paper reports an experiment in which
the role of individual difference variables was
studied in relation to behavior on a task widely
used in these fields, the task of self-description.
A person's verbal reports can be interpreted as a
function of his personal characteristics and the
characteristics of the self-report situation. In the
experiment described here both subjects (Ss) and
experimenters (Es) had, prior to the experiment,
been psychometrically assessed on one individual
difference variable, the tendency to attribute
behavior hostility to oneself. Since the content of
the behavior hostility questionnaire dealt with
hostility expressed in interpersonal settings, it
seemed of interest to determine the degree to which
it might be predictive of behavior in an
interviewlike interpersonal situation.

1 This research was supported by research grants
from the National Institute of Mental Health, United
States Public Health Service (M-3889) and from the
State of Washington Initiative 171 Fund to IGS.

Another individual difference variable, sex, was
also studied both for Ss and Es. Thus, the experiment
involved four individual difference variables: Ss’
sex and hostility scores, and Es’ sex and hostility
scores. By relating these variables to Ss’
self-descriptions, it was possible to address
ourselves to a number of questions and problems.
Among these were:

1. Do persons differing in hostility inferred from
a paper-and-pencil test respond differently to the
task of self-description in a free verbalization
situation? This question seemed especially
interesting since Ss’ hostility scores were derived
from a set of items focused directly on hostility
reactions, while the self-description task in the
interviewlike situation was not structured as dealing
specifically with self-perceived tendencies towards
hostility.

2. More generally, are the several individual
difference variables, and combinations of them,
related to Ss’ verbal behavior? For example, do male
and female Es differ in the verbal behavior which
they elicit from Ss? Do high and low hostile Es
elicit different types of verbal responses from Ss?

3. Are the individual difference variables studied
here related to Es’ perceptions of Ss and Ss’
perceptions of Es? In order to shed light on this
question, after each experimental session E and S
described and reacted to each other in terms of a
series of rating scales.

4. If Es differing in sex and hostility are found
to have different influences over Ss’ verbal
behavior, how might these be explained? In order to
provide suggestions concerning one possible
explanation, it was arranged to have each
experimental session observed by persons whose task
it was to record occurrences of particular types of
responses emitted by E. If Ss’ self-descriptions were
not comparable for different types of Es, it seemed
likely that records of Es’ overt behavior in the
experimental session might provide clues to the bases
of their influences over Ss.

This experiment, thus, was concerned with one
organismic variable, sex, one test-inferred variable,
hostility, and the relationships of these to Ss’
verbal behavior, Es’ and Ss’ expressed reactions to
each other, and Es’ behavior as rated during the
experimental session. Study of these relationships
was viewed as a contribution to the analysis of the
experimental interaction conceived as a significant
interpersonal event.


TABLE 1
ITEMS OF THE BEHAVIORAL HOSTILITY SCALE

1. I seldom strike back, even if someone hits me first.
2. I never get mad enough to throw things.
3. I sometimes show my anger by banging on the table.
4. If someone hits me first, I let him have it.
5. When I am mad, I sometimes slam doors.
6. Even when my anger is aroused, I don't use “strong
   language.”
7. When people yell at me, I yell back.
8. When I really lose my temper, I am capable of
   slapping someone.
9. When I get mad, I say nasty things.
10. When arguing, I tend to raise my voice.
11. I can remember being so angry that I picked up the
    nearest thing and broke it.
12. I often make threats I don't mean to carry out.
13. If I have to resort to physical violence to defend
    my rights, I will.
14. I have had several bitter arguments with my father.
15. I have had several bitter arguments with my mother.
16. I am often so annoyed when someone tries to get
    ahead of me in a line of people that I speak to him
    about it.


METHOD 
Subjects 


The Ss for the present study were drawn from 
introductory psychology classes at the University 
of Washington. Prior to the experiment, they had 
been given a questionnaire in which the Behavioral 
Hostility (BH) Scale was included. This Scale con- 
sists of 16 items each of which refers to a specific 
type of overt hostile response. Table 1 contains 
the items. 

The BH score distribution was divided into the 
upper and lower quartiles and Ss were selected from 
these extremes. The low BH group had scores between
0 and 4, and the high BHs had scores between 9 and 15.
Equal numbers of males and females were included
within the experimental design. Each group
represented in the experimental design (2 × 2 × 2 × 2)
contained three Ss, making a total of 48 Ss. The Ss
were informed of their appointments through
telephone calls made by GHW.
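
The cell structure of this design can be sketched in a few lines of Python. This is only an illustration: the factor labels are hypothetical names chosen here, and the only figures taken from the text are the four two-level factors and the three Ss per cell.

```python
from itertools import product

# Four two-level factors named in the text (the labels are illustrative).
factors = {
    "bh_of_e": ("low", "high"),
    "sex_of_e": ("female", "male"),
    "bh_of_s": ("low", "high"),
    "sex_of_s": ("female", "male"),
}
SS_PER_CELL = 3  # stated in the text

# Every combination of factor levels is one cell of the design.
cells = list(product(*factors.values()))
total_ss = len(cells) * SS_PER_CELL

print(len(cells))  # number of cells in the 2 x 2 x 2 x 2 design
print(total_ss)    # total number of Ss across all cells
```

Enumerating the cells this way makes it easy to confirm that the stated cell size and the stated total agree.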


Experimenters 


A total of 16 Es were used in the study. Their 
names were selected on a random basis from the 
records of undergraduate students who had taken 
the BH Scale in previous quarters. In no case was 
the period between the time the E took the BH 
Scale and the time he was contacted to serve as an 
E greater than three quarters. No student was 
selected as an E who was over 21 years of age. 
None of the Es were psychology majors. 

Of the 16 Es, 8 were female and 8 were male. 
These Es, who were volunteers, were selected from 
the same score ranges as the Ss. Thus, high scoring 
Es were chosen from the range 9-15, and low scoring 
Es were selected from the range 0-4. Each E was 
paid $8 for his or her cooperation. 

Prior to running Ss, the individuals who indicated 
willingness to serve as Es were seen in groups of 
two and three and told about the procedure in- 
volved in carrying out the experiment. They were 
instructed about such matters as the manner of 
introducing themselves to Ss, and the use of the 
tape recorder, and then were shown copies of an S 
rating sheet and told how to fill it out. Three Ss 
were run by each E and assignment of Es to Ss 
was random within the restrictions of the experi- 
mental design. 


Observers 


Ten students (volunteers) from upper division
psychology classes at the University of Washington
were used to observe the experimental sessions. A
pair of observers was present for each session. For
two sessions, they observed experimental sessions
which were not actually part of the study proper
and rated the Es on a series of rating scale items.
The third session was used to obtain information
on the reliability of some of the items which were
included.


Reliability of Observers’ Ratings 


The reliability of the scales used by the observers 
of the Es was obtained in the following manner: 
each item was rated on a scale from −10 to +10, and
the raw scores of the two raters observing each E
were obtained and compared. If the raw scores were
equal in the case of the two observers, or were one
point off in either direction, a 1 was recorded. If the
scores differed by more than one point, a 0 was
recorded. The “1s” and “0s” were tallied, and the
number of “1s” was divided by the total number of
possible agreements. The result
was a percentage of agreement score. After items
with low levels of agreement were eliminated, 14 
items remained. These items had to do with such 
aspects of E's behavior as postural movements, 
gestures, smiling, and amount of verbalization. 
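
The agreement rule described above can be written out as a short Python sketch. The function name, its `tolerance` parameter, and the example scores are assumptions made for illustration; only the one-point rule and the division by the number of possible agreements come from the text.

```python
def percent_agreement(ratings_a, ratings_b, tolerance=1):
    """Percentage of paired ratings that differ by at most `tolerance` points."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both observers must supply the same, non-empty set of ratings")
    # Record a 1 for scores that are equal or one point apart, otherwise a 0.
    agreements = [1 if abs(a - b) <= tolerance else 0
                  for a, b in zip(ratings_a, ratings_b)]
    # Divide the number of 1s by the total number of possible agreements.
    return 100.0 * sum(agreements) / len(agreements)


# Hypothetical raw scores on the -10 to +10 scale, for illustration only.
observer_1 = [3, -2, 7, 0, 5]
observer_2 = [4, -2, 4, 1, 9]
print(percent_agreement(observer_1, observer_2))
```

In this invented example three of the five pairs fall within one point of each other, so the pair of observers would be credited with three agreements out of five.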

Each experimental session lasted 12 minutes. The 
observers were told that they were to observe the 
Es for a period of 1 minute and then rate them on 
the 14 items. They were then to observe for another 
minute and again rate the Es on the 14-item sheet. 
As a result of this procedure it was possible to 
obtain seven ratings for each E during the course
of time that he was running his S. All observations 
were done through a one-way mirror. E and S were 
unaware that they were being observed. 


Experimental Room and Apparatus 


The room in which the experiment was conducted 
measured 4 × 8 feet and contained a table and two
chairs. On one side of the room was a one-way
mirror for observation purposes. The table was ar- 
ranged so that the E was clearly visible from the 
other side of the mirror. On a chair by the E's side 
was a Wollensak T-1500 tape recorder with a 
microphone attachment. 


Procedure 


After meeting S, E brought him to the experi- 
mental room, gave him the instructions and handed 
him the microphone from the tape recorder. The 
Ss were told that the purpose of the experiment was 
to find out how students think and feel about 
themselves. Prior to the time that E met S, the 
observers were installed in the observation booth
and were seated facing E. 

If the S did not speak for 10 seconds, two
“prodding” remarks were made by E: 


Can you think of anything else? 
Please tell me more. 


At the end of the session GHW went to the experi- 
mental room and asked the S to fill out a rating 
sheet on the E.


Dependent Measures


Speech content categories. The Ss’ verbalizations
were divided into a number of categories for purposes
of content analysis (Sarason & Ganzer, 1962). The 
following categories were used: 


1. Positive Self-References (PSR). Self-references
reflecting favorably on S (e.g., “I am well-liked by
other people”).

2. Negative Self-References (NSR). Self-references
reflecting unfavorably on S (e.g., “I have great
fears about meeting new people”).

3. Ambiguous Self-References (ASR). Self-references
not classifiable as either favorable or unfavorable
to S (e.g., “My major is chemistry”).

4. Positive Other-References (POR). References
about others which reflect favorably on them (e.g.,
“My girl friend is very understanding”).

5. Negative Other-References (NOR). References
about others which reflect negatively on them (e.g.,
“My fraternity brothers are stupid”).

6. Ambiguous Other-References (AOR). References
about others not classifiable into either positive or
negative categories (e.g., “My brother likes sports”).

7. Father References. References to father (e.g.,
“My father likes the outdoors”).

8. Mother References. References to mother.

9. Peer References. References to unrelated others
who are approximately in the same age group as S
(e.g., “My friends are a good bunch”).

10. Asking Directions of the E. Any statements
which involve asking the E about how the task is
to be performed (e.g., “How am I supposed to talk
about myself?”).

11. Present-Tense References (e.g., “He is a nice
person”).

12. Past-Tense References (e.g., “When I was in
high school, I thought that college would be easy”).

13. Future-Tense References (e.g., “I will be more
mature in a few years”).


Speech disturbance categories. Several of the speech 
disturbance categories used in this study were com- 
parable to those developed by Mahl (1956). The 
formal aspects of speech (to be distinguished from 
the content of speech) included these disturbance 
categories: 

1. “Ah.” Whenever the definite “ah” sound (as
distinguished from “er” or “um”) occurred, it was
scored (e.g., “Well . . . ah . . . when I first came
home, I didn’t have a job”).

2. Sentence Correction. An interruption in the
word-to-word progression of sentences (e.g., “The
main reason that I don’t . . . didn’t like him was
that he was always complaining”).

3. Sentence Incompletion. An interrupted expression
clearly left incomplete (e.g., “My mother
used to be . . . he was trying to get away with
it”).

4. Repetition. The serial repetition of one or more
words; usually one or two words (e.g., “Then he
was . . . he was drafted into the army”).

5. Stutter. The rapid and repeated repetition of a
syllable or letter in a word (e.g., “Well, then he
we.we.we.we.went home”).


Reliability of the Speech Content and Speech
Disturbance Categories


The overall interrater reliability of both sets of
categories was based upon five interviews selected
randomly. The mean reliability coefficient for the
speech content categories was .90 and for the speech
disturbance categories, .92.


RESULTS 


Because of the large number of significant 
results of the analyses of variance performed, 
triple- and fourth-order interactions will in 
most cases simply be noted. 


Content Analysis 


An initial analysis was carried out to de- 
termine the relationship of the individual dif- 
ference variables to the number of statements 
of all types made by Ss. It was found that 
high BH scorers made more statements than 
low BH scorers (Ms = 75.04 and 61.42,
respectively, p < .05). 

Two effects reached the .05 level in an 
analysis of the number of positive self-refer- 
ences. One was for the differences between Ss 
who had been paired with male and female Es 
and the other was for the BH of E × BH of
S × Sex of S interaction. The main effect was
due to a greater number of PSRs from Ss run
by male Es than from Ss run by female Es
(Ms = 10.58 and 6.79, respectively).

Four Fs were significant in the analysis of
variance for negative self-references. The F
for sex of E was due to a greater number of
NSRs for Ss run by male than by female Es
(Ms = 13.46 and 9.25, respectively, p < .025).
The F for BH of S was due to a greater
number of NSRs by high than by low scorers
(Ms = 13.71 and 9.00, respectively, p < .01).
The F for sex of S was due to a greater number
of NSRs by female than by male Ss (Ms
= 13.17 and 9.54, respectively, p < .05). The
Sex of E × Sex of S interaction (p < .005)
was due to the fact that female Ss made
slightly more NSRs than did male Ss when
E was a female (Ms = 12.17 and 10.08,
respectively), but when E was a male, female
Ss made a great many more NSRs than did
male Ss (Ms = 15.52 and 7.92, respectively).

Only one F, that for the BH of E × Sex
of E interaction, attained significance in the
analysis of ambiguous self-references (p <
.01). This interaction was due to the fact
that when Ss were paired with low BH Es,
they made more ASRs if E was a female
than a male (Ms = 51.42 and 33.33, respectively),
and when they were paired with
high BH Es, they made more ASRs if E was
a male than a female (Ms = 48.67 and 43.58,
respectively).
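
One way to read the two two-way interactions just described is as a difference of differences among the four cell means. The sketch below is illustrative only: the function and the 0/1 coding of levels are assumptions introduced here, not the authors' analysis, and the cell means are simply those quoted in the text.

```python
def interaction_contrast(cell_means):
    """Difference of differences for a 2 x 2 table of cell means.

    cell_means[(a, b)] is the mean for level a of the first factor and
    level b of the second factor, with levels coded 0 and 1.
    """
    return ((cell_means[(1, 1)] - cell_means[(1, 0)])
            - (cell_means[(0, 1)] - cell_means[(0, 0)]))


# NSRs: first factor = sex of E (0 female, 1 male),
# second factor = sex of S (0 male, 1 female).
nsr = {(0, 0): 10.08, (0, 1): 12.17, (1, 0): 7.92, (1, 1): 15.52}

# ASRs: first factor = BH of E (0 low, 1 high),
# second factor = sex of E (0 male, 1 female).
asr = {(0, 0): 33.33, (0, 1): 51.42, (1, 0): 48.67, (1, 1): 43.58}

print(round(interaction_contrast(nsr), 2))
print(round(interaction_contrast(asr), 2))
```

A contrast far from zero means that the effect of one factor depends on the level of the other. For the NSRs the female-minus-male difference is larger with male Es than with female Es; for the ASRs the female-minus-male E difference reverses sign between low and high BH Es.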

In the analysis of positive and negative 
reference to others, only one F reached the 
.05 level. In each case it was the effect for 
BH of S. High BH Ss made more positive 
references to others than did low BH Ss 
(Ms = 1.2 and .46, respectively). They 
also made more negative references to others 
than did low BH Ss (Ms = 2.63 and .79, 
respectively). 

Two Fs were significant at the .07 level in 
the analysis of ambiguous references to others. 
One was for BH of E × BH of S × Sex of S.
An F for sex of E revealed that Ss paired
with female Es tended to make more ambiguous
statements than did those paired with
male Es (Ms = 7.46 and 3.04, respectively).

None of the Fs in the analyses of references 
to mothers and fathers reached conventional 
significance levels. There were, however, two 
significant Fs in the analyses of references to 
peers. These were for BH of S (p < .05)
and the BH of E × BH of S × Sex of S
interaction (p < .05). The main effect was due
to more peer references given by high than by
low BH Ss (Ms = 3.63 and 1.17, respectively).

There were no significant differences among 
the various groups in the tendency to ask E 
for further directions during the experimental 
session. 

The analysis of present-tense references 
revealed five significant Fs. Significant at the 
.05 level was the difference between groups 
paired with high and low BH Es. The Ss
paired with high BH Es made more uses of
the present tense than did Ss paired with low
BH Es (Ms = 59.63 and 44.92, respectively).
The significant BH of E × Sex of S interaction
(p < .05) arose because male and female Ss
responded differently to Es differing in BH.
Male Ss made more present-tense references
when paired with low BH Es than did female
Ss (Ms = 48.08 and 41.75, respectively).
However, female Ss made more present-tense
statements than did male Ss when E was
high in BH (Ms = 67.33 and 51.92,
respectively).

There were three significant higher order
interactions. These were the effects for BH of
E × Sex of E × BH of S (p < .05), BH of
E × BH of S × Sex of S (p < .05), and the
interaction involving all four individual
difference variables (p < .025).

The analysis of the number of past-tense 
references yielded three significant results. 
One of these was due to sex of E (p < .05). 
The Ss paired with female Es used the past 
tense more than twice as often as did Ss 
paired with male Es (Ms = 15.79 and 6.58,
respectively). The variable of BH of S was
also significant (p < .025). High BH Ss
made more statements in the past tense than
did low BH Ss (Ms = 16.92 and 5.46,
respectively). There was a significant BH of
E × Sex of E × BH of S interaction
(p < .025).

The analysis of variance of future-tense
statements failed to reveal any significant effects.


Speech Disturbances and Intrusions


An overall analysis was performed on the 
sum of the following measures: numbers of 
“ah’s,” sentence corrections, sentence incom- 
pletions, repetitions, stutters, and laughs. This 
analysis did not reveal any significant Fs. 
However, several significant results were un- 
covered when individual categories were ex- 
amined. For example, it was found that males 
emitted more “ah’s” than did females (Ms 
= 21.08 and 12.79, respectively, p < .01), 
and that high BH Ss made more sentence 
corrections than did low BH Ss (Ms = 15.33 
and 10.96, respectively, p < .05). 

Two interactions were found to be sig- 
nificant (p < .025 in both cases) in an analy- 
sis of sentence incompletions. One of these, 
that for BH of E × Sex of E, was due to the
fact that Ss paired with low BH Es had fewer
incomplete sentences when E was male, but
for Ss paired with high BH Es just the
opposite was the case. The other interaction was
that for Sex of E × Sex of S. Female Ss
made more incomplete sentences than male
Ss when E was a male, but the reverse was the
case when E was a female. The highest order
interaction in the analysis of incomplete
sentences was also significant (p < .025).

The analysis for the frequency of laughter 
by Ss throughout experimental sessions re- 
vealed six significant Fs. The F for BH of E 
was due to a much higher frequency of laughs 
by Ss paired with low BH Es than with high 
BH Es (Ms = 7.33 and 2.79, respectively,
p < .01). Male and female Ss differed in that
females laughed more frequently than did
males (Ms = 7.50 and 2.63, respectively,
p < .01).

One of the significant interactions, that
for BH of E × Sex of E (p < .025), resulted
from more laughter by Ss paired with female,
low BH Es than those paired with male, low
BH Es, but less laughter by Ss paired with
female, high BH Es than by Ss paired with
male, high BH Es. The BH of E × BH of S
interaction (p < .05) was due to the fact that
the group of high BH Ss paired with low BH
Es laughed more than twice as frequently
as did all other groups involved in the
interaction. The BH of E × Sex of E × BH
of S, and the BH of E × Sex of E × Sex of S
interactions were significant (p < .025 and
p < .05, respectively).

Because of the many analyses to which we 
have referred, it would perhaps be useful to 
summarize some of the verbal behavior find- 
ings which seem to be especially noteworthy: 

1. High BH Ss were more productive, that 
is, emitted more statements, than were low 
BH Ss. This difference seemed to be due 
particularly to a greater number of evaluative 
statements by high BH than by low BH Ss.

2. Sex of E seemed to influence a number 
of verbal response categories. In general, Ss 
paired with male Es made more personal 
references, fewer ambiguous, and fewer past- 
tense statements than Ss paired with female 
Es. Male Ss paired with male Es gave fewer 
NSRs than any other group. 

3. Female Ss laughed more frequently during
the interviewlike session than did male Ss.

4. The Ss paired with high BH Es laughed 
less frequently and used the past and present 
tenses more frequently than did Ss paired 
with low BH Es. The Ss run by male high BH 
Es gave fewer ambiguous responses than any 
other group.


Ss’ Reactions to Es


The questionnaire which each S filled out 
contained items of two types. One dealt with 
E’s body movements and gestures and the 
other with S’s impressions of E as a person. 
Since almost all of the results concerning Es’
body movements and gestures took the form
of higher order interactions that are not readily
interpretable, they will not be presented in detail
here.

A number of consistencies emerged in 
studying Ss’ impressions of the Es with whom 
they had contact. These seemed to center 
around Ss’ evaluations of E. The Ss’ ratings 
of male and female Es differed on the follow- 
ing items (mean ratings of female Es are 
presented first, and the means of male Es 
second): 


1. E's friendliness (Ms = 17.97 and 15.81,
p < .01)

2. E’s enthusiasm (Ms = 13.03 and 10.22,
p < .01)

3. E's professional manner (Ms = 13.38
and 11.25, p < .05)

4. Degree to which E was encouraging (Ms
= 15.28 and 13.00, p < .025)

5. E's pleasantness (Ms = 18.31 and 17.03,
p < .05)

A constant was added to Ss’ original ratings 
to achieve a scale ranging from 0 to +20. 
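
The rescaling mentioned here amounts to a simple shift. The sketch below assumes that Ss' original ratings ran from −10 to +10, as the observers' scale did, so that the added constant is 10; neither the original range of Ss' rating sheet nor the value of the constant is stated in the text.

```python
SHIFT = 10  # assumed constant: moves a -10..+10 rating onto 0..+20


def rescale(raw_rating):
    """Shift a raw rating from the assumed -10..+10 scale to 0..+20."""
    if not -10 <= raw_rating <= 10:
        raise ValueError("raw rating outside the assumed -10..+10 range")
    return raw_rating + SHIFT


print([rescale(r) for r in (-10, 0, 8, 10)])
```

Because every rating is shifted by the same amount, differences between group means are unchanged by the transformation.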

In addition to these differences, female Es 
were described as having more expressive 
facial characteristics than male Es, and male 
Es were rated as speaking significantly more 
slowly than female Es. 

Many of the same items as above were 
responded to differently by male and female 
Ss (females’ means appear first, followed by 
males’ means) : 


1. Degree to which E was liked (Ms =
17.28 and 15.31, p < .025)

2. E’s interest in S (Ms = 15.57 and 13.95,
p < .01)

3. E's courteousness (Ms = 19.22 and
17.53, p < .01)

4. Degree to which E's voice was pleasant
(Ms = 18.10 and 16.06, p < .01)

5. Degree to which E was encouraging (Ms
= 15.28 and 13.00, p < .01)

6. E's pleasantness (Ms = 18.37 and 16.97,
p < .025).

From these results it would appear that Ss
made more favorable or positive evaluations 
of female than male Es and that female Ss 
rated Es with whom they were paired more 
favorably than did male Ss. In drawing 
these general inferences, however, it is neces- 
sary to be aware that there were also many 
significant interactions among the four indi- 
vidual difference variables. Consider, for ex- 
ample, Ss’ ratings of Es’ degree of friendli- 
ness. As mentioned earlier, female Es were 
rated as being more friendly than male Es. 
There were, however, four other significant 
effects in the analysis of ratings of Es’ friend- 
liness. 

One of these was the Sex of E × Sex of S
interaction (p < .05). Male Ss tended to
rate male Es as more friendly than female Es;
the reverse was the case for female Ss’ ratings
of Es. Another significant F was that for BH
of E × Sex of S (p < .025). Male Ss gave
higher friendliness ratings of low BH Es than
did female Ss; the reverse was the case for
ratings of high BH Es. The BH of S × Sex
of S interaction (p < .025) arose because
female, low BH Ss rated Es more friendly
than did male, low BH Ss. On the other hand,
male high BH Ss rated Es as more friendly
than did female high BH Ss. Finally, the BH
of E × Sex of E × Sex of S interaction was
significant at the .001 level.

Several significant interactions tended to 
recur from analysis to analysis. This was true, 
for example, for the Sex of E × Sex of S
interaction. We have seen that male Ss made 
higher friendliness ratings for male Es than 
did female Ss. On the other hand, female Ss 
gave higher friendliness ratings of female Es 
than male Ss. Rating scale items dealing with 
the degree to which E was personal, relaxed, 
and casual showed significant differences in 
exactly the same direction. 

Another recurring interaction was the one 
for BH of E × Sex of S. On six items, those
relating to E's friendliness, enthusiasm,
interest, vocal expression, encouragement of S,
and pleasantness, the group which gave the
Es the most favorable ratings was that
consisting of female Ss paired with high BH Es.
The BH of S × BH of E interaction also showed
a consistent trend. For items related to E's
casualness, quietness of voice, interest,
behavioral consistency, and pleasantness,
ratings by Ss of Es dissimilar to them in BH
were always higher than the ratings by Ss
of Es with BH scores similar to them.

The only readily apparent consistencies in
Ss’ ratings of Es’ gestural and expressive
behavior were found in the BH of E × Sex of
E interactions. Significant consistencies were
found for the following items: E’s use of
hand gestures, arm gestures, head gestures,
trunk movements, leg movements, and body
movements in general. In every case Ss paired
with female high BH Es described their Es
as exhibiting more gestures and movements
than all other combinations of the BH of E
and Sex of E variables.

It would appear from these data that: 
female and male Es were perceived differently 
by Ss, male and female Ss differed in their 
perceptions of Es’ behavior, and various combinations
of the individual difference variables
were related to the ratings of Es by Ss. 


Es’ Ratings of Ss


Although there were many significant inter- 
actions in the analyses of Es’ ratings of Ss, 
there were fewer item-to-item consistencies 
among them than was the case for Ss’ ratings
of Es. One consistency which did emerge
involved the BH of S × Sex of S interaction.
For items dealing with the degree to which
S was liked, was courteous, and behaved
consistently, male low BH Ss always received
the lowest ratings of the four groups
involved in the interaction. In the case of the
BH of E × Sex of E interaction, for items
dealing with S's interest, enthusiasm, and
courteousness, male low BH Es and female
high BH Es gave consistently higher ratings
to Ss than did male high BH and female low
BH Es.

Two major consistencies in main effects 
were observed in the analyses of Es’ ratings of
Ss. As in the case of Ss’ ratings of Es, these
involved sex of E and sex of S. For sex of E,
female Es rated Ss significantly higher on the
following items than did male Es: the degree to
which S was personal in the experimental
setting, and the degree to which he or she was
courteous, businesslike, and had a pleasant
voice. Female Es also reported more hand
gestures by Ss than did male Es. For sex of S,
Es rated female Ss higher than male Ss on
these items: liking of S, S’s enthusiasm, S’s
businesslike air, and body movements.

One additional finding of note was that 
low BH Es described Ss as being significantly 
more likable and more consistent in their be- 
havior than did high BH Es. 


Observers’ Ratings of Es 


The most consistent findings in analyses 
of observers’ ratings were significant differ- 
ences in ratings of the behavior of male and 
female Es. Female Es were rated as looking 
at Ss, smiling, and nodding in agreement 
more frequently than male Es. Female Es 
were also described as making more hand 
gestures and placing their hands near their 
faces more frequently than male Es. Male Es 
were described as fidgeting more than female 
Es and manipulating objects with their hands 
more than did female Es. 

Low BH Es were described by observers as 
nodding agreement and looking at Ss more 
frequently than did high BH Es. The Es who 
were paired with female Ss were rated as 
looking more frequently at Ss, and fidgeting 
and leaning forward less than Es who were 
paired with male Ss. 

There were four significant differences be- 
tween Es paired with high and low BH Ss. 
Those paired with low BH Ss were described 
by observers as smiling more frequently and 
leaning forward more frequently than Es 
paired with high BH Ss. On the other hand, 
Es paired with high BH Ss were described as 
fidgeting and manipulating objects with their 
hands more than were Es paired with low 
BH Ss. 

Significant BH of S × Sex of S interactions
were found on four rating scale items. These
were due to the facts that Es paired with
male high BH Ss were described as looking
less frequently at Ss, and fidgeting, manipulating
objects with hands, and placing their
hands near the face less frequently than were
Es paired with low BH Ss. Four significant Fs
were also found for the Sex of E × Sex of S
interaction. The group of female Es paired with
male Ss were described as smiling, leaning
forward, manipulating objects, and using hand
gestures more frequently than all other groups
involved in this interaction.

Two other interactions seem worth noting.
A significant (p < .001) BH of S × BH of
E interaction was found for the frequency
with which E looked at S. The mean rating
of this frequency for cases in which Ss and
Es were either both high or both low in BH
was 3.8. Where Ss and Es had different BH
scores the mean was 6.2. The frequency with
which E looked at S was significant also as a
Sex of E × BH of S interaction (p < .01).
Female Es paired with low BH Ss looked at
Ss more frequently than did all other
combinations of the sex of E and BH of S
variables. A significant difference in the same
direction for this interaction was found for
the frequency with which E smiled at S.

It would appear that the pairings of E and 
Ss do affect the behavior of Es as rated by 
observers. The most consistent differences, 
however, in this study arose when male and 
female Es and Es paired with high and low 
BH Ss were compared. 


Discussion 


Our results add weight to previous findings 
indicating that personal characteristics of 
both Ss and Es influence Ss’ behavior in 
experimental situations (Rosenthal, 1963; 
Sarason, 1962; Winkel & Sarason, 1964). 
The self-description situation studied here is 
different from previously reported demonstra- 
tions of these relationships. As more and more 
data of this type come to the fore for a 
broadening variety of situations the necessity 
for researchers interested in behavior to at- 
tend to the social psychology of psychological 
experiments becomes increasingly clear. 

In the present study, self-descriptions were 
clearly related to sex and hostility scores of 
Ss. High BH scorers made a greater number 
of statements than did low BH scorers and 
females made a greater number than did 
males. These differences were attributable to 
a considerable extent to the tendency of high 
BH and female Ss to make more evaluative 
statements about themselves than did low BH 
and male Ss. The differences are interesting 
in light of the task which was presented to 
Ss, namely, to describe their attitudes and 
how they feel about themselves. While high 
BH and female Ss might be viewed as having 
a greater tendency to conform more fully to 
task demands than low BH and male Ss, the 
results are also compatible with the view that 
females and high scorers on hostility question- 
naires are more able than other persons to 
evaluate themselves publicly, openly, and 
frankly. 

Sex and hostility scores of Es also in- 
fluenced Ss’ self-descriptions. Again, bearing 
in mind the task as presented to S in the 
experimental session, it is interesting that Ss 
paired with male Es made more evaluative 
personal references than did Ss paired with 
female Es. Furthermore, high BH Es tended 
to elicit from Ss significantly fewer ambigu- 
ous self-references than did low BH Es. Thus, 
it would appear that Ss paired with high BH 
and male Es were more successful in perform- 
ing the self-description task than were Ss 
paired with low BH and female Es. 

That E and S characteristics have combina- 
tive influences on self-description was seen in 
the many significant interactions which were 
uncovered. An example of these is a significant 
Sex of S x Sex of E interaction for negative 
self-references which was attributable to a 
greater number of NSRs made by female Ss 
paired with male Es than by any other group. 
Individual differences associated with Ss and 
Es and combinations of these individual dif- 
ferences are significantly related to persons' 
self-descriptions. 

The results of this experiment showed, 
also, that personal characteristics of E and S 
are significantly related to their descriptions 
and perceptions of each other. This was very 
clearly true for the sex variable. In general, 
Ss saw female Es as more friendly, pleasant, 
and enthusiastic than male Es. Also, female 
Ss tended to say that they liked their Es more 
than did male Ss. Female Es tended to de- 
scribe more favorably Ss with whom they had 
been paired than was the case for male Es for 
their Ss. Female Ss tended to be described 
more favorably by Es than were male Ss. 

It seems clear that females, be they Ss or 
Es, tend to be described more favorably than 
males. While this consistency is of considerable 
interest and might provide a basis for numer- 
ous hypotheses, it seems especially necessary 
to relate the findings concerning persons' de- 
scriptions of others to their descriptions of 
themselves. Our data suggest that, at least 
under the present conditions, the more favor- 
ably Ss describe Es the less satisfactorily will 
they carry out the self-description task. Al- 
though female Es were described more favor- 
ably by Ss than were male Es, Ss paired 
with male Es made more personal references 
and fewer ambiguous references than did Ss 
paired with female Es. 

A similar finding obtained for the variable 
of hostility scores. While Es who were high 
BH scorers tended not to be described as 
favorably as low BH scorers, they, neverthe- 
less, tended to elicit more personally meaning- 
ful material from Ss. Listening to the tapes 
of Ss paired with high and low BH Es cor- 
roborated these statistical findings. One, then, 
is left with the impression that a person’s de- 
gree of comfort with and liking for another 
person may not be indicative of a willingness 
to “open up.” This sort of observation has 
been made frequently in the area of psycho- 
therapy where it often happens that patients’ 
progress may not be dependent on their ap- 
parent levels of rapport with therapists (Sara- 
son, 1954; Strupp, 1962). Our results suggest 
that personal compatibility and congeniality 
as traditionally defined may not be required 
to elicit personally relevant material. 

The differences in descriptions by Ss of 
male and female and high and low BH Es 
were in a sense corroborated by the ratings 
of Es by observers. These observations also 
suggested that Es’ behavior during experi- 
mental sessions was very much influenced by 
the characteristics of Ss with whom they were 
paired. This indicates that to attend to E as 
an influence over S is only half the story. 
The influence of S over E's behavior would 
seem psychologically of equal significance. 

The results of our experiment have rele- 
vance to several areas within the field of 
personality. Mention has already been made 
of their similarity to some observations re- 
ported in the psychotherapy literature. The 
discrepancy between ratings of others and 
self-descriptions would seem of interest to 
students of person perception and empathy 
between persons. Finally, they seem poten- 
tially explicable in terms of the concepts of 
modeling and imitation (Bandura & Walters, 
1963). Why do apparently more congenial Es 
elicit less personally significant material from 
Ss than do apparently less congenial Es? 
One possibility that comes to mind is that Es 
relatively high in congeniality and the tend- 
ency to make culturally approved social re- 
sponses may wittingly or unwittingly be 
functioning as models for other persons. 
Socially conventional and approved behavior 
in an E may, as it were, rub off on the S 
with whom he comes into contact. Similarly, 
the tendency for Ss paired with Es who are 
less socially adept to more fully carry out the 
task of introspection may come about as a 
result of cues provided by E that suggest that 
social desirability or conventionality need not 
be adhered to as a guiding principle during 
the experimental session. 

As we implied earlier, generalizing from 
the present findings to situations which do not 
involve self-description may be done cau- 
tiously. For the task of self-description, the 
results do demonstrate the need for social 
psychological approaches to self-description 
and interview situations. Relating these re- 
sults to prior experiments on E-S variables, 
it seems inescapable that the limits of the 
impact of these variables in psychological 
research generally have as yet not been 
demarcated. 

By manipulation all Ss were marginally accepted by the group, relatively un- 
attracted to the group, and disagreed with the group on an important issue. 
All had their opinion tested on 3 occasions, the last 2 of which were “private” 
testings. Experimental Ss were committed to continue in the same group; con- 
trol Ss were not. In 2 experimental conditions, each S had a deviate-ally in the 
group, i.e., someone who agreed with S's initial opinion. In the 3rd experi- 
mental condition, no deviate existed. When S knew the deviate existed prior 
to the 2nd opinion measure, there was no opinion change. However, knowledge 
of the deviate's existence after opinion change had little effect. When no 
deviate existed, commitment increased opinion change. 

Textbooks in social psychology usually con- 
clude that the less attracted a person is to a 
group, the less influence the group may exert 
over him. Indeed, Hare (1962) in a recent 
review of the literature listed 15 references to 
support this conclusion. Thus, the group 
should be least able to influence the least at- 
tracted member. 

However, more recent evidence (Kiesler, 
1963; Kiesler & Corbin, 1965) indicates that 
this is not always the case. Kiesler and Cor- 
bin, in a study in which both attraction and 
disagreement were manipulated, demonstrated 
that when subjects were not committed to 
continue with the same group, there was the 
typical monotonic relationship between at- 
traction to the group and opinion change as 
a result of disagreement with the group. The 
less the subject was attracted, the less he 
changed his opinion in the direction of that 
presumably held by the group. However, 
when the subjects were committed to continue 
in the same group, there was a nonmonotonic 
relationship between attraction and opinion 
change (privately expressed). That is, com- 
mitment to continue increased opinion change 
in the low-attraction condition, but not for 
other levels of attraction. The main concern 
of the present paper is to further explore the 
psychological parameters of opinion change in 
committed, low-attracted persons. 

By manipulation, the subject must con- 
tinue interacting in a group by which he is 
only marginally accepted, to which he is rela- 
tively unattracted, and with which he subse- 
quently finds himself in disagreement on an 
important topic or norm. Kiesler and Corbin, 
in their analysis of the committed, low-at- 
tracted subject, imply that the subject’s first 
impulse is to reject the group psychologically, 
but that rejection or devaluation of the group 
is very difficult when one is committed to 
continue the interaction over several meet- 
ings. The subject is presumably in an un- 
comfortable and conflictful state. One way out 
is to change his opinion, to be more like the 
group. One implication of this change is that 
in addition to reducing the discrepancy be- 
tween the subject and the group, it may also 
increase the subject's attractiveness to other 
group members (cf. Walker & Heyns, 1962, 
and Jones, 1964, on the instrumental aspects 
of opinion change; and Heider, 1958, on the 
effects of opinion similarity). This line of ar- 
gument is indirectly supported by both the 
Kiesler and Kiesler and Corbin experiments 
which indicate that the committed, low-at- 
tracted subject does indeed change his opinion 
toward that held by the group. 

In addition, the data suggest that this 
effect is not simply compliance without pri- 
vate acceptance (Festinger, 1953), but that 
the change reflects a basic realignment of 
cognitions about the attitude object. In both 
of our experiments, the second attitude meas- 
ure was a private one, which the group pre- 
sumably would never see. Thus, there was no 
reason for the subject to “put his best foot 
forward” or overtly conform (as in Asch, 
1956). 

The present experiment tested the stability 
of the effect and further explored the psycho- 
logical state of the committed, low-attracted 
subject. Both aspects were tested by utilizing 
a deviate-ally in the group; that is, a person 
who agreed with the subject's initial opinion. 
In the present experiment, we varied the 
time when the subject ascertains that the 
deviate-ally exists. That is, we varied 
whether the subject knew the deviate-ally 
existed before or after the subject had an 
opportunity to change his opinion. For ex- 
ample, suppose the committed, low-attracted 
subject has already changed his opinion. Will 
the subsequent knowledge that a deviate ex- 
ists (someone who is advocating the subject’s 
initial opinion) induce the subject to change 
back to his initial opinion, or is this a cogni- 
tive change stable enough to resist subse- 
quent attack? Kiesler has implied that the 
effect is stable and should be resistant (but, 
of course, not impervious) to subsequent at- 
tack. We may test the stability of the effect 
by allowing the subject to ascertain, subse- 
quent to opinion change, that a deviate-ally 
exists in the group. On the other hand, 
knowledge of the existence of a deviate-ally 
prior to opinion change should have very 
different effects. That is, we have argued that 
opinion change is one way out of a conflict- 
ful state for the subject in this situation. 
But a deviate-ally should reduce this conflict. 
If this is true, then knowledge that the devi- 
ate exists prior to opinion change should les- 
sen the conflict and reduce the necessity for 
change. 

To test these ideas, the present experiment 
has four conditions. By manipulation, sub- 
jects in all four conditions were marginally 
accepted by the group, were relatively un- 
attracted to the group, and disagreed with the 
group on an important issue. All had their 
opinions tested on three different occasions, 
the last two of which were “private” testings. 
These last two measures reflect what we shall 
refer to as Opinion Change 1 (C1) and 
Change 2 (C2), respectively. The subjects in 
the three experimental conditions were com- 
mitted to continue interacting with the group 
over four different occasions; the uncom- 
mitted-control subjects had to continue with 
the project, but not necessarily with the 
same group. The three experimental condi- 
tions differed with regard to whether a deviate 
existed, and when the knowledge of the 
deviate was acquired. In one experimental 
condition (no deviate, or ND) there was no 
deviate, and also none in the uncommitted- 
control condition. In a second experimental 
condition (early deviate, or ED), the exis- 
tence of the deviate was known prior to the 
second opinion measure, and hence presum- 
ably prior to the resolution of the opinion 
discrepancy between the subject and the 
group. In the last experimental condition 
(late deviate, or LD), knowledge of the devi- 
ate's existence came after the second opinion 
measure, and hence after the subject had re- 
solved the opinion discrepancy between him- 
self and the group. The hypotheses are listed 
below. 


Hypotheses for Opinion C1 (Second Opinion 
Measure minus First Opinion Measure) 


1. The ND condition will evidence greater 
opinion change than will the control condi- 
tion. The only procedural difference between 
the ND and Control conditions is that the 
ND subjects are committed to continue with 
the same group. Kiesler and Corbin found 
that subjects committed to continue showed 
greater opinion change than subjects not so 
committed. Thus this comparison provides a 
replication of the Kiesler and Corbin experi- 
ment, as well as a base line for comparison 
with other opinion change. 

2. The ND condition will show greater 
opinion change than will the ED condition. 
This hypothesis is based on the assumption 
that the presence of the deviate-ally greatly 
reduced the pressure for change in the ED 
condition. There are three strong reasons for 
this reduction in the pressure to change for 
ED subjects. One, the presence of the devi- 
ate reduces the potential effectiveness of the 
subject changing his opinion. If the subject 
changed, he still would not agree with all of 
the group, and there would always be one 
person in the group to point out the logic of 
the former position. Two, the subject in the 
ED condition has one bit of important infor- 
mation that the subject in the ND condition 
does not, viz., the group is not unanimous in 
their opinion. Three, the fact that the devi- 
ate agrees with the subject provides some 
consensual validation for the subject's posi- 
tion. Each of these reasons by itself 
should be effective in reducing the pressure 
to change. Since our simple manipulation 
should make all three salient, it should have 
a powerful effect. This leads us to speculate 
that there may be less change in the ED con- 
dition than in the uncommitted-control con- 
dition. This speculation is based on the 
somewhat unsophisticated assumption that 
the three things reducing pressure to change 
in the ED condition should be more effective 
than the one thing (decrease in commitment) 
in the uncommitted-control condition. The 
speculation appears to be reasonable and is 
stated formally as Hypothesis 2a. 

2a. Based on the argument above, we pre- 
dict that there will be greater opinion change 
in the uncommitted-control condition than in 
the ED condition. 


Hypotheses for Opinion C2 (Third Opinion 
Measure minus First Measure) 


3. The ND condition will show greater 
(total) change than will the LD. This merely 
states that knowledge of the deviate’s ex- 
istence after opinion change will produce 
some regression towards the subject’s initial 
opinion. However, we predict that the prior 
resolution of the opinion discrepancy in the 
LD condition will inhibit or prevent total 
regression to the subject’s former opinion. 
This is stated in Hypotheses 3a and 3b. 

3a. The LD condition will evidence a net 
positive change greater than zero. 

3b. The LD condition will show greater 
net positive change than the uncommitted- 
control condition. 

4. The ED condition will show less net 
change than will the LD condition. This is an 
important hypothesis for our argument, be- 
cause at the end of the experiment the ED 
and LD conditions will have undergone pre- 
cisely the same experiences. Only the tem- 
poral sequence of these experiences is varied 
(see Method section). Thus this will provide 
a strong test of the stability of the Kiesler 
and Kiesler and Corbin data. 

4a. The ED condition will show less change 
than the uncommitted-control condition. This 
is based on the assumption that having 
gained an ally the ED subject will resist 
later influence attempts. The three reasons 
for the ED subject’s initial resistance to in- 
fluence should still be effective. 

5. This hypothesis is concerned with rela- 
tive liking for the deviate. Because the devi- 
ate in effect resolves a conflict for ED, but 
challenges the resolution of one for LD, we 
hypothesize that ED subjects will find the 
deviate more attractive than will LD subjects. 
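
The directional predictions in Hypotheses 1 through 4a can be summarized as ordered pairs of conditions. The following is a minimal sketch for illustration only, not the authors' analysis: the names `PREDICTIONS` and `check_predictions` are hypothetical, the function compares condition means only, and it performs no significance test.

```python
# Hypothetical encoding of the directional hypotheses stated above.
# Each tuple is (hypothesis label, condition predicted higher, condition predicted lower).
PREDICTIONS = {
    "C1": [("1", "ND", "Control"), ("2", "ND", "ED"), ("2a", "Control", "ED")],
    "C2": [("3", "ND", "LD"), ("3b", "LD", "Control"),
           ("4", "LD", "ED"), ("4a", "Control", "ED")],
}


def check_predictions(means, measure):
    """Return {hypothesis label: True/False} for one change measure.

    means: dict mapping condition name ("ND", "Control", "ED", "LD")
           to a mean opinion-change score for that measure.
    measure: "C1" or "C2".
    Only the ordering of means is checked (an illustrative simplification).
    """
    results = {}
    for label, higher, lower in PREDICTIONS[measure]:
        results[label] = means[higher] > means[lower]
    if measure == "C2":
        # Hypothesis 3a: net positive change in LD greater than zero.
        results["3a"] = means["LD"] > 0
    return results
```

Passing a dictionary of condition means for a given measure returns, for each hypothesis, whether the observed ordering runs in the predicted direction.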


METHOD 
Subjects 


Two hundred and fifteen high-school boys volun- 
teered to take part in five- and six-man discussion 
groups. Seventeen subjects were discarded from the 
analysis (7 were suspicious, 5 could not read, and 5 
could not return for later sessions), leaving a net 
sample of 198. The subjects were recruited (news- 
paper advertisements, solicitation in record shops, 
etc.) and did not know each other. Each subject 
was paid $2 for his time. 


Overview 


By manipulation, all subjects were led to believe 
that they were only marginally accepted by the 
group and that they disagreed with the group on an 
important issue; the extent of disagreement was 
identical for all subjects. All subjects had their 
opinions tested on three occasions. The first opinion 
measure was used to compute a bogus group opinion; 
the difference between the first and each of the 
last two provided the two dependent measures of 
opinion change. Each subject received two bogus 
messages from other subjects in the group, one just 
prior to the second opinion measure and one just 
prior to the third opinion measure. The difference 
in content of these messages provided the variation 
in experimental conditions. For the ND condition 
(as well as the uncommitted-control), the first 
message received supported the group opinion and 
the second commented on an irrelevant topic. For 
the LD condition, the first message supported the 
group opinion, the second supported the subject's 
initial opinion. The subjects in the ED condition 
received the deviate message (supporting the 
subject's initial opinion) first, and the group-support 
message second. The dependent variable is opinion 
change. 


Procedure 


The procedure is very similar to that of Kiesler 
and Kiesler and Corbin (see Table 1). Each 
session consisted of five or six subjects and two 
experimenters. The subjects were told that the 
experimenters worked for the (fictional) “American 
Institute for Small Group Research” and that the 
Institute was studying how groups of strangers work 
out certain tasks, and how they go about solving 
problems and arriving at solutions. However, since 
the Institute was interested in the process of 
problem solving, the experimenters would have to 
interrupt the discussion several times at first. 

[Table 1: Sequence of events and differences among experimental conditions] 

Each subject was told he would have to come back 
for three other sessions, for which he would be paid 
$1.50 per hour. In addition, the group could win 
prizes for “efficiency.” It was emphasized that the 
important thing was how the group went about 
solving the problem, not whether they agreed on a 
solution. 


Commitment Manipulation 


All subjects were led to believe they had to come 
back for three additional sessions. However, the 
subjects in the three experimental conditions were 
told that they would continue in the same group 
for the whole time. The uncommitted-control 
subjects were told they could change groups if they 
wished and “might be changed anyway,” but that 
it would not affect their chances of winning prizes. 
The committed subjects, then, anticipated 
interacting with the same group over several sessions. 
The uncommitted subjects felt they had to participate 
in several group sessions, but not necessarily with 
the same group. 


First Opinion Measure 


The topic of discussion was: “What sorts of 
qualities do boys most like to see in girls?” This 
was picked as a topic because of the inherent 
interest to high-school boys, and because the topic 
would probably represent opinions not firmly 
anchored in extraexperimental norms. Eight such 
qualities (taken from Taylor, 1938) were shown on 
a chart (e.g., intelligence, nice looking, good 
listener). The subjects were told that the task for 
the group was to decide on the relative importance 
of the qualities; that is, to decide in what order 
they should be ranked. The subjects were asked to 
give reasons why each quality might be important 
and the experimenter asked each subject in turn. 
This was done for two reasons: to make certain 
that each subject was participating actively and to 
provide some basis for interpersonal ratings later. 
The subjects were warned that they should not 
say anything reflecting on the relative importance 
of the qualities yet, only the absolute importance. 


Each subject then rated his liking for each of the 
other members of the group (see below). The sub- 
jects then individually and privately ranked the 
importance of the qualities. They were told other 
“polls” would be taken later, but that this first 
one would subsequently be used as a basis for 
group discussion. One experimenter then began to 
compile a group consensus for each subject (the 
“consensus” was to exclude the subject’s own opin- 
ion). 


Manipulation of Attraction to the Group 


Previous studies (Kiesler, 1963; Kiesler & Cor- 
bin, 1965) have shown that attraction to the group 
may be reliably varied by manipulating acceptance 
by the group. This method was used here as well. 

Each subject filled out two 11-point scales for 
each other subject, rating liking and potential 
contribution to the group. The group had pre- 
viously been told that if anyone received a con- 
sistently low rating, his performance would not be 
taken into consideration when the group was rated. 
When the experimenter collected these scales, he 
shuffled through them and announced that no one 
had been rejected. He then threw them on a desk. 
The experimenter asked the subjects not to talk 
while he was compiling the group opinions. As an 
afterthought, he suggested that perhaps the subjects 
would like to see the ratings others had given them. 
He passed out bogus ratings and announced that 
the average rating was between "like a little" and 
"like a lot" and between "contribute a little" and 
"contribute a lot." In fact, the previously prepared 
bogus ratings indicated that each subject was con- 
siderably below this average. The bogus liking and 
contribution scales averaged 2.40 on the 11-point 
scale. All subjects, regardless of condition, received 
this low-acceptance manipulation. 


Group Consensus 


Each subject was then given a bogus group con- 
sensus, presumably representing the average of the 
opinions of others in the group, excluding himself. 
In fact, ranking of the qualities was based upon the 
subject's ranking and was systematically manipu- 
lated in the following way. Regardless of how the 
subject initially ranked the stimuli, the "group's 
opinion" ranked first whatever the subject had 
ranked sixth; the subject's Ranks 1-5 were moved 
down one rank each; and Ranks 7 and 8 remained 
the same. This produced a constant discrepancy in 
opinion between the subject and the group regard- 
less of the subject's initial opinion. The dependent 
variable for opinion change was the extent to which 
the subject moved his sixth choice upwards (towards 
the bogus group opinion) on the two subsequent 
private testings. 
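
The ranking manipulation and the resulting change score can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not part of the original procedure: the function names `bogus_consensus` and `opinion_change` are hypothetical, and a ranking is assumed to be a list of eight quality labels ordered from Rank 1 to Rank 8.

```python
def bogus_consensus(subject_ranking):
    """Build the bogus "group opinion" from a subject's own ranking.

    subject_ranking: list of 8 labels, index 0 = Rank 1 ... index 7 = Rank 8.
    The subject's sixth choice is placed first, the subject's Ranks 1-5
    each move down one rank, and Ranks 7 and 8 stay where they are.
    """
    if len(subject_ranking) != 8:
        raise ValueError("expected a ranking of exactly eight qualities")
    sixth_choice = subject_ranking[5]
    return [sixth_choice] + subject_ranking[:5] + subject_ranking[6:]


def opinion_change(initial_ranking, later_ranking):
    """Number of ranks the initially sixth-ranked quality moved upwards.

    Positive values are movement towards the bogus group opinion;
    negative values are movement away from it.
    """
    sixth_choice = initial_ranking[5]
    new_rank = later_ranking.index(sixth_choice) + 1  # convert to 1-based rank
    return 6 - new_rank
```

Because the sixth choice can end anywhere from Rank 1 to Rank 8 on a later ranking, the score computed this way runs from +5 down to -2, and the discrepancy from the bogus consensus is the same whatever order the subject started with.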


Messages 


Each subject was then told he could write a 
message to anyone else in the group but that talking 
would be discouraged for a “little while.” Each sub- 
ject was given a form for his message, handed it to 
the experimenter (whether or not he completed it), 
and the experimenter presumably passed it on to the 
proper person. 

These messages were systematically manipulated 
and were the basis for the difference among experi- 
mental conditions. Three messages were used: one of 
irrelevant content, one supporting the group, and 
one supporting the subject's initial opinion. The 
irrelevant message read, "This note writing is sure 
a different way to discuss a problem." The pro- 
group message read "I agree almost completely with 
the group. I think that what they rank at the top 
really belongs there." The pro-subject or anti-group 
message read "I just can't go along with the group 
all the way. For example, I think that what the 
group ranked as number one should definitely be 
closer to the bottom." The order of these messages 
is discussed below. 


Irrelevant Information 


Each subject then was handed a typed sheet of 
comments about each quality, presumably made by 
the subjects in other groups. There were two vari- 
ations of this list. Both were irrelevant in the sense 
that they essentially said all the qualities were im- 
portant, but did not hint at relative importance. 
They were included to provide an additional excuse 
for the subsequent private rerankings. 


Private Reranking 
The experimenter then announced, 


Now that you have had a chance to learn what 
others in and outside the group have felt, the 
Institute wants to know your opinions about 
these qualities. So we will give you an importance 
ranking to fill out for them. This ranking is com- 
pletely private—it is only for the Institute and 
the group will never see it. I repeat, the group 
will never see this ranking. Jim [experimenter] 
will put them in the envelope [here experimenter 
holds up a large stamped envelope, addressed to 
American Institute of Small Group Research, 
Washington, D. C.] and we will send it directly 
to Dr. Andrews at the Institute. 


Thus, each subject reranked the qualities, assured 
that other subjects (and by implication, experi- 
menters) would not see their rerankings. 

Each subject subsequently wrote and received 
another message, received the second bit of irrele- 
vant information, and completed a second private 
ranking. 


Experimental Conditions 


Table 1 shows the sequence of events and the 
difference among experimental conditions. As indi- 
cated there, the difference among the experimental 
conditions is provided by the sequences of the 
bogus messages received. 

The ND condition receives the pro-group mes- 
sage first, privately reranks the qualities, then re- 
ceives the irrelevant message, and ranks the quali- 
ties for the third time. 

The uncommitted-control condition is identical to 
the ND condition, except for the variation in com- 
mitment to continue in the same group. 

The ED condition receives an anti-group (pro- 
subject) message first, reranks, receives the pro- 
group message and reranks.² 

The LD condition receives the pro-group message 
first, reranks, receives the anti-group (pro-subject) 
message and then reranks.² It should be empha- 
sized that at the time of the second opinion measure, 
the ND and LD conditions were identical in pro- 
cedure. At that time, one may consider these two 
conditions one group of subjects that subsequently 
“branch off” procedurally. 
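
The four conditions described above differ only in commitment and in the order of the two bogus messages, which can be laid out as a small lookup table. The sketch below is illustrative; the dictionary layout and the helper name `knows_deviate_before` are hypothetical and are not taken from the original report.

```python
# Each condition: whether subjects were committed to the same group, and
# the bogus messages received before the second and third opinion measures.
CONDITIONS = {
    "ND": {"committed": True, "messages": ("pro-group", "irrelevant")},
    "Control": {"committed": False, "messages": ("pro-group", "irrelevant")},
    "ED": {"committed": True, "messages": ("anti-group", "pro-group")},
    "LD": {"committed": True, "messages": ("pro-group", "anti-group")},
}


def knows_deviate_before(condition, measure):
    """True if the subject has received the deviate's (anti-group) message
    before the given opinion measure (1, 2, or 3)."""
    if measure not in (1, 2, 3):
        raise ValueError("measure must be 1, 2, or 3")
    received = CONDITIONS[condition]["messages"][: measure - 1]
    return "anti-group" in received
```

Laid out this way, ND and LD share the same first message and so are procedurally identical up to the second opinion measure, while ED and LD receive the same two messages in opposite order.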

Final scales and termination. Each subject com- 
pleted five 13-point a priori scales (with the end 
points only labeled) asking: the extent to which the 
subject felt he had to return to later sessions; the 
extent to which the subject felt he had to stay in 
his present group; the extent to which he felt others 
in the group liked him; the extent to which he 
liked others in the group; and the extent to which 
he agreed with the rest of the group on the rankings 
of the qualities. He then wrote down the identifica- 
tion numbers of other subjects with whom he felt: 
he agreed, disagreed, liked best, liked least. 

The subjects then wrote down what they thought 
was the real purpose of the study. The experimenters 
completely explained the experiment with special 
reference to the attraction manipulation, swore the 
subjects to secrecy, and dismissed them. 


RESULTS 
Effectiveness of the Manipulations 


The commitment manipulation is of para- 
mount importance here. On the question 
which asked the extent to which the subject 
felt he had to stay with his present group, 
experimental subjects were significantly more 
affirmative in their response than control 


2 There were actually two variations of the ED 
condition and three variations of the LD condition. 
In the second ED variation, the deviate sent the 
Second message as well, indicating that he had come 
to think that the group was correct. In one of the 
other LD variations, the subject was asked to write 
on his opinion form that it reflected his true opin- 
ion and sign it, The second LD variation was simi- 
lar, except that after signing it was suggested (pre- 
Sumably by another subject) that the subjects 
should be able to see each others’ form “to see if 
anyone changed their opinion.” None of these 
Variations produced any differences whatsoever with 
the respective basic conditions and hence will not 
be discussed further. They do however account for 
the disparity in cell Ns in the Results section. 
subjects (t = 8.62, df = 196, p < .0001).³
We may conclude that the commitment manipulation
was effective. As might be expected, there is no
significant difference between experimental and
control conditions on the question which asked the
extent to which the subject felt he had to return
for later sessions (t = 1.32).

The effectiveness of the acceptance manip- 
ulation is more difficult to check. All subjects 
received a low-acceptance manipulation and 
there is no group with which to compare these 
subjects. However, the overall mean of subjects
on the question of “how much do others in the
group like you” was 4.56 on a 13-point scale.
This is significantly lower than the midpoint of
the scale (t = 13.79, df = 197, p < .0001) and
the confidence interval indicates that μ < 4.85.
We may at least say
that the subjects perceived others' liking for 
them to be quite negative. It appears our 
acceptance manipulation was effective also. 
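
As a rough check on the figures above, the reported mean and t value imply a standard error, and from it an upper confidence bound. The sketch below assumes that the scale midpoint is 7 (the middle of a 1-to-13 scale) and that a one-sided normal critical value of 1.645 was used; neither assumption is stated in the text, and the variable names are illustrative only.

```python
# Back out the standard error implied by the reported one-sample t test.
mean = 4.56          # reported mean on the liking item
midpoint = 7.0       # assumed midpoint of a 1-to-13 scale
t_reported = 13.79   # reported t against the midpoint

# t = (midpoint - mean) / se, so se follows directly.
se = (midpoint - mean) / t_reported

# Assumed one-sided 95% bound using the normal critical value.
z_crit = 1.645
upper = mean + z_crit * se

print(round(se, 3), round(upper, 2))
```

Under these assumptions the upper bound comes out close to the 4.85 reported in the text.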

We might note the results for the item that 
asked the extent to which the subject felt he
agreed with the group. As might be expected, 
the subjects who had no knowledge of a devi- 
ate, that is, those in the ND and control con- 
ditions, perceived greater agreement than 
subjects who were aware of a deviate some- 
time in the experiment, that is, subjects in 
the LD and ED conditions (t = 2.17, df = 196,
p < .05). LD subjects tended to indicate
slightly more agreement than ED, but this 
difference did not approach significance (t = 
1.12). Last, there were no differences among 
conditions on the question which asked how 
much the subject liked the group. 


Opinion Change 


Opinion change for each subject was com- 
puted by ascertaining the number of ranks 
that he moved his sixth choice upwards. 
Scores, then, may range from -2 to +5.
Table 2 presents the opinion change data. C1
indicates the change present on the second
opinion measure; C2 indicates the total change
present on the third opinion measure. C2 - C1
indicates the change after the second measure.
Figure 1 shows these results graphically. In 


3 All t and CR tests are two-tailed, unless otherwise
noted.
Fig. 1. Opinion change as a function of experimental
condition for C1 and C2. (ED appears before
C1; LD, after C1. See text.)


Figure 1, means are corrected for the slight
difference between ND and LD conditions for
C1, since at that time there was no procedural
difference between these two conditions.
This “correction” is for expositional
purposes only and does not affect the computation
of probability levels. Let us look at the
opinion change data for C1 and then move to
C2.
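
The scoring just described can be illustrated as follows. This is a minimal sketch, assuming that ranks are recorded as integers with 1 as the top rank and that the item initially ranked sixth is tracked across the three opinion measures; the function name and the sample ranks are hypothetical.

```python
def opinion_change(initial_rank, later_rank):
    """Number of ranks the item moved upward (toward rank 1)."""
    return initial_rank - later_rank

# Hypothetical subject: the item is ranked 6th at first, 4th on the
# second opinion measure, and 3rd on the third opinion measure.
first, second, third = 6, 4, 3

c1 = opinion_change(first, second)  # change present on the second measure
c2 = opinion_change(first, third)   # total change present on the third measure
gain = c2 - c1                      # change after the second measure

print(c1, c2, gain)
```

A negative score would indicate that the item was moved downward rather than toward the group's position.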

Tests of hypotheses for C1 (second opinion
measure minus first opinion measure). Our
first hypothesis stated that the ND condition
would show greater opinion change than
would the uncommitted-control. To test this
hypothesis (as well as all C1 hypotheses involving
ND) we may consider the ND condition
to be composed of both ND and LD
(there was no procedural difference between
them at that time). Hypothesis 1 was clearly
supported: ND subjects showed significantly
more opinion change than did uncommitted-controls
(t = 2.30, df = 141, p < .05).

Hypothesis 2 stated that ND subjects 
would show more change than would ED 
subjects. This was also clearly supported 

4 The difference between the ND and LD conditions
for C1 is not significant (t = 1.52). The correction
involves pooling the two conditions for C1
(overall, M = 2.01), and adding to that the original
C2 - C1 scores (+.39 and -.27, respectively). The
implication of this difference for the interpretation
of the C2 - C1 scores is discussed later.
(t = 5.70, df = 166, p < .0001). Hypothesis
2a stated that less opinion change would
occur in the ED condition than in the
uncommitted-control condition. This was also
supported (t = 2.37, df = 83, p < .005).

Thus, all three hypotheses for C1 were
supported. The Kiesler and Corbin data were
clearly replicated. Also, when the subject 
knew that a deviate-ally existed prior to the 
opinion change measure, the effect was dis- 
sipated completely. The ED mean opinion 
change was significantly less than either the 
ND or uncommitted-control conditions. 

Tests of hypotheses for C2 (third opinion
measure minus first measure). Hypothesis 3
stated that the ND condition would show
greater opinion change than would the LD
condition. Table 2 shows that these two
conditions did not differ on C2. However, since
the mean change for the two conditions was
slightly different on C1, perhaps a more
proper test would be the gain in opinion
change over the second opinion measure. This
is represented in Table 2 as C2 - C1, showing
that ND subjects continued to change
toward the group, while LD subjects showed
some change back toward their initial opinion.
The difference between these two change
scores is significant beyond the .05 level
(t = 2.37, df = 111). We must conclude that
the joint effect of the irrelevant message in
the ND condition and the deviant message
in LD produced a difference in opinion
change between these two conditions between
the second opinion measure and the third.

Regardless of joint effects, however, the
lack of difference between LD and ND conditions
on the C2 comparison makes the status
of this hypothesis somewhat unclear. However,
rejecting the hypothesis would merely
imply that the Kiesler and Corbin effect is
more resistant to attack than we had previously
imagined. Obviously, the lack of unequivocal
support for this hypothesis does not
detract from the present theoretical analysis
nor the generalizability of our previous data.


TABLE 2


OPINION CHANGE MEANS FOR C1 AND C2
BY EXPERIMENTAL CONDITION


                   Experimental condition

             Uncommitted
Measure      control       ND         LD         ED
N            30            [illeg.]   [illeg.]   [illeg.]
C1 M         1.13          1.58       2.19       [illeg.]
C2 M         1.47          1.97       1.92       [illeg.]
C2 - C1      +.34          +.39       -.27       +.33


Hypothesis 3a stated that LD opinion 
change would still be significantly different 
from zero. This is a subsidiary hypothesis,
but clearly supported (t = 9.43, df = 79,
p < .0001). The LD subjects obviously did not
revert back to their initial position, in spite 
of the fact that someone else in the group 
subsequently agreed with their initial opinion. 

Hypothesis 3b stated that the LD condi- 
tion would show greater opinion change than 
the uncommitted-control. As may be seen 
from Table 2, the means were in the proper 
direction, but not significantly so (t = 1.12).
To clarify this point, an internal analysis 
was performed. There was a slight difference 
between the two conditions in the distribu- 
tion of scores on the item that asked if sub- 
jects must return for the later sessions. This 
obviously could have a systematic effect on 
the opinion change scores. Therefore an 
analysis was performed arbitrarily excluding 
the subjects in the LD (N = 13) and
uncommitted-control (N = 9) conditions who
marked this scale at the midpoint or below. 
The difference between these two conditions 
becomes stronger when the manipulations are 
thus artificially strengthened (now t = 1.87,
df = 106, p < .10). We may tentatively con- 
clude a tendency in the proper direction. 
Further explication is obviously needed, 
however. 

Hypothesis 4 has the most important implications
for C2. This hypothesis stated that
the LD condition would show greater opinion
change than the ED condition. This hypothesis
was clearly supported (t = 3.91, df = 133,
p < .001). At this time LD and ED
subjects have had identical experiences (more
precisely, experimental operations), but in a
different temporal sequence. The results from
the questions asking with whom in the group
the subject agreed and disagreed yield similar
results. Using as our statistic the difference
in proportion of subjects indicating
agreement with the deviate minus the proportion
disagreeing with the deviate, we find
significantly greater agreement with the deviate
in the ED condition than the LD (CR
= 2.29, p < .05). The implications of this
finding are presented in the Discussion
section.

Hypothesis 4a stated that the ED condi- 
tion would show less opinion change than the 
uncommitted-control. The results show a 
marginal tendency in the proper direction 
(t = 1.86, df = 83, p < .10), indicating ten- 
tative support for Hypothesis 4a. We may 
note that the mean numerical difference between
these two conditions is approximately
the same for C1 and C2, but the C2 scores
are less reliable than C1.

Hypothesis 5 was concerned with inter- 
personal attraction. It stated that since the 
deviate in effect resolved a conflict for ED, 
but challenged the resolution of one for LD, 
ED subjects would find the deviate more 
attractive. To test this hypothesis we derived 
a liking change score for each subject for the 
deviate. The subjects were asked at the be- 
ginning of the experiment for their first im- 
pression of the others in the group. They 
filled out the same scales at the end of the 
experiment. Each subject thought he knew 
the identification number of each other 
subject, including the deviate. Each subject
received a liking change score by a subtrac- 
tion of his first rating of the deviate from 
the second. The change in liking for the 
deviate was +.27 for the ED condition and 
—.33 for the LD condition. These mean 
changes are significantly different (t = 1.83,
df = 125, p < .05, one-tailed). An alternative
method of testing this hypothesis is to
utilize the questions asked at the end of the
experiment, “which person do you like best
in the group?” and “which person do you like
least in the group?” In the ED condition 58%
of the subjects said they liked the deviate
best (13% least); in the LD condition
14.7% said they liked the deviate best
(13.3% least). Using as our statistic the
difference between the proportion liking the
deviate best minus the proportion liking him
least for each condition, the difference is very
significant (CR = 6.33, p < .0001). We may
conclude that the deviate was better liked in
the ED condition than in the LD condition,
as predicted.
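
The statistic described above can be written out as follows. This is only a sketch of the net-preference difference, using the percentages reported in the text; the critical ratio itself is not recomputed here, because the cell sizes and the standard error used by the authors are not given. Variable names are illustrative.

```python
# Proportions reported in the text.
ed_best, ed_least = 0.58, 0.13
ld_best, ld_least = 0.147, 0.133

# Net preference for the deviate within each condition (best minus least).
ed_net = ed_best - ed_least
ld_net = ld_best - ld_least

# The test statistic is built on the difference between the two net scores.
difference = ed_net - ld_net

print(round(ed_net, 3), round(ld_net, 3), round(difference, 3))
```

The net score is far larger in the ED condition than in the LD condition, which is the direction of the reported result.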


DISCUSSION


In general, the data provided strong sup- 
port for the hypotheses. In addition, our pre- 
vious data (Kiesler, 1963; Kiesler & Corbin, 
1965) were supported in two ways. One, the 
comparison between the ND and uncommitted-control
conditions for C1 provided a
strong replication of the Kiesler and Corbin
data. Two, the fact that this opinion change
was resistant to subsequent influence (LD
versus ED comparison for C2) provides
strong support for our continuing contention 
that this is not a simple compliance paradigm. 

It is interesting to note similar effects for
compliance settings, however. Asch (1956) 
has previously noted that the presence of a 
deviate greatly reduces the pressure for com- 
pliance in a so-called Asch-type situation. 
Gerard and Rotter (1961) have noted that 
anticipation of future interaction with the 
same group, here referred to as commitment 
to continue with the same group, also en- 
hances conformity in an Asch-type situation. 
The notion that commitment to continue and 
presence of a deviate have similar effects for 
both compliance and private acceptance is 
intriguing and demands further study. How- 
ever, we may note that the Kiesler and 
Corbin results indicated that commitment 
to continue enhanced opinion change only in 
the low-attraction condition, and did not affect
opinion change in the average- and high-attraction
conditions. This is not what one
would expect in a compliance setting (Dittes 
& Kelley, 1956), and suggests that the simi- 
larity of effects for compliance and private 
acceptance paradigms may be only superficial. 

It is apparent that disagreement with the 
group combined with commitment to con- 
tinue and low attraction produced a very 
uncomfortable state for the subjects in the 
present experiment. The ED condition points 
this out quite well. In that condition, the 
presence of the ED not only reduced opinion 
change to near zero, but the subsequent influence
attempt was resisted as well. Apparently,
ED subjects appreciated the presence
of the deviate: 58% indicated they liked the
deviate best in the ED condition, compared
to 14.7% in the LD condition. In generalizing
these results, however, the reader should
keep in mind that the “deviate” in this experiment
agreed with the subject's initial
opinion; he was really a “deviate-ally.” The
results therefore cannot be generalized to
situations where this is not the case as in,
say, the Schachter (1951) paradigm.

Theoretically, the effect of commitment is 
to make certain alternative responses (e.g.,
devaluation of the group) more difficult and 
less probable (cf. Kiesler & Sakumura, 1966). 
In the present case, it induces pressure for 
the subject to “make his peace” with the 
group. Considering that commitment to con- 
tinue enhances opinion change only for low- 
attracted subjects (Kiesler & Corbin, 1965) 
its effect appears to be that of forcing the 
subjects to deal with uncomfortable situations 
they would rather reject or ignore. If true, 
then commitment to continue with the same 
group (or person) may effect considerable 
cognitive or attitudinal change in situations 
in which previous data and theory have led 
us to expect little change (e.g., current theory 
would predict little or no attitude change in 
any of the conditions in the present experi- 
ment). For example, take the case of extreme 
sanctions with active surveillance. Although 
present theory in social psychology strongly 
suggests there should be little cognitive 
change with extreme sanctions (e.g., Festin- 
ger, 1957; Kelman, 1958, 1961), there are 
a number of contrary examples in social reality.
(The average person in modern Russia
is perhaps a good counterexample.) The present
theorizing suggests that barriers against
rejecting, devaluing, or leaving the group, as
in commitment to continue, would effect considerable
attitude change for this situation.
In addition, of course, it may be that we 
have been limited by the degree to which 
sanctions and surveillance may be manipulated
in the laboratory. In any event, commitment
to continue appears to be a powerful
variable for interpersonal influence.
94 Ss were shown slides of either a Negro or a white young man, who was
either well dressed or poorly dressed and simultaneously heard a tape-recorded
statement which was either in favor or opposed to integrated housing and
which was spoken either in excellent or in ungrammatical English. The stimuli
formed a 2 X 2 X 2 X 2 factorial design. The evaluation of the stimulus
person was measured by the evaluative factor of the semantic differential;
the behavioral intentions of Ss toward the stimulus persons were measured by
3 factors of the behavioral differential. It was shown that liberal Ss differed
from illiberal Ss in the relative weights they employed for the characteristics
race, dress, English, and opinion. Furthermore, English and opinion were the
determinants of the evaluation judgments, as well as the social acceptance
judgments on the behavioral differential. Race and English were the important
determinants of the judgments on the social distance factor. English, race, and
dress, in that order, were important in the determination of the judgments
on the friendship factor.


Recent studies of social distance have demonstrated
that it is possible to obtain a number of
lawful relationships between the characteristics of
a person and the behavioral intentions and evaluations
of this person by subjects. These judgments
are largely determined by cultural factors
(Triandis, 1964a; Triandis, Davis, & Takezawa,
1965; Triandis & Triandis, 1960, 1962). A multidimensional
instrument which measures the behavioral
component of attitudes, called the
behavioral differential (Triandis, 1964b), may be
used to measure the behavioral intentions toward
the stimuli, while the semantic differential
(Osgood, Suci, & Tannenbaum, 1957) may be
used to measure the evaluation, potency, and
perceived activity of the stimulus persons. With
American subjects, race was found to be a powerful
determinant of variance in social distance
judgments; with Greek subjects religion was very
important; with German and Japanese subjects
occupation was the important determinant of
variance.

The above-mentioned work has relied on questionnaires
in which complex stimulus persons,
such as a “Negro, Portuguese, Roman Catholic,
Physician,” were judged in connection with social
distance items forming an equal-interval scale,
standardized in the culture in which the study
took place. More recently, similar stimuli were
used with semantic differential and behavioral
differential scales. Since stimuli presented as
verbal descriptions may have limitations, it was
decided that an approximation of real persons,
having the abstract characteristics described in
our questionnaires, would provide a test of the
generality of previous findings.


1 The project was supported by Grant G-22878
from the National Science Foundation, in support
of a program of Undergraduate Research Participation.
Levin and Loh conducted the study under the
direction of the first author, during the summer
of 1964.


METHOD 
Subjects 


All subjects were Caucasian, native-born Ameri- 
cans attending the 1964 summer school session at the 
University of Illinois. Fifty-six males and 38 females 
were tested. Eighty percent of the subjects perceived
themselves as belonging to the middle class; 73%
came from urban environments, mostly from Illinois.
Seventy-seven percent were Christian, 6%
Jewish, and 17% considered themselves nonreligious.
When asked to describe their coloring, 42% checked 
“light,” 52% “medium,” and 6% “dark.” 


Procedure 


The stimuli employed in the experiment consisted 
of color slides of two young men with simultaneous 
presentation of a tape-recorded voice making a statement
about civil rights. Sixteen combinations of
stimulus characteristics, generated by a 2 X 2 X 2 X 2 
factorial design, were employed in a fixed order after 
presentation of a warm-up and two anchoring pic- 
tures. The slides showed (a) a Negro or white man, 
who was (b) wearing a suit and carrying an attaché 
case or in overalls and carrying a lunch pail. The 
tape-recorded voices presented a message that was 
(c) either in favor or against integrated housing and 
(d) either in excellent English or in poor English. 
The verbal statements had been previously scaled by 
Davis and Triandis (1965) to represent extreme posi- 
tions on the particular civil rights issue. Examples of 
the statements representing the two extremes in opin- 
ion and the two extremes in grammar are shown be- 
low. The actor who read the statements into the 
tape recorder attempted to employ an appropriate 
Negro or white accent, and a polished or a “poor” 
accent, depending on the combination of treatments 
involved in the particular condition. 

1. The City Council should pass a law prohibiting
discrimination on the basis of race, religion, or 
ethnic background in any and all housing. 

2. Discrimination in housing is strictly a private 
affair and no action should be taken by the City 
Council or any other government body which would 
interfere with private property rights in any way. 

3. Them guys in City Hall better pass a law that 
ain't gonna let nobody keep anybody out of a 
house they wanna live in. 

4. Them guys in City Hall better not put their 
noses in with no law when a private homeowner 
wants to sell his house only to people like him. 
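
The sixteen stimulus combinations described above can be enumerated mechanically. The following is a minimal sketch; the factor labels are paraphrased from the text, and the listing order is arbitrary, since the fixed presentation order used in the experiment is not given.

```python
from itertools import product

# Four two-level stimulus characteristics, labels paraphrased from the text.
factors = {
    "race": ["Negro", "white"],
    "dress": ["suit and attaché case", "overalls and lunch pail"],
    "opinion": ["for integrated housing", "against integrated housing"],
    "english": ["excellent", "poor"],
}

# One level per factor in every possible combination: 2 x 2 x 2 x 2 = 16.
conditions = [
    dict(zip(factors, levels)) for levels in product(*factors.values())
]

print(len(conditions))
```

Each entry of `conditions` pairs one slide (race and dress) with one recorded statement (opinion and level of English).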

For each of the stimulus conditions, after the slide 
and tape-recorded voice had been presented, the 
subjects responded to 15 behavioral differential scales
taken from the factor analysis of Triandis (1964b). 
In the analysis that follows only three of the factors 
will be employed: the admiration factor, determined 
by the subject judgments on the “would admire the 
character of this person” and “would admire the 
ideas of this person” scales; the social distance factor, 
determined by the judgments on “would exclude 
from the neighborhood” and “would not accept as a
close kin by marriage”; and the friendship factor, 
determined from the subject judgments on the scales 
“would accept as an intimate friend” and “would 
eat with this person.” The analysis of the responses 
to two semantic differential scales (good-bad, wise- 
foolish) will also be reported. 

Two of the stimulus conditions were repeated to 
obtain an estimate of the reliability of these judg- 
ments. After the subjects completed responding to 
the stimulus slides and tape recordings, they com- 
pleted a questionnaire in which a number of civil 
rights issues were judged against a set of evaluative 
semantic differential scales. 


Analysis 


A scalogram analysis of the responses of the sub- 
jects to the civil rights issues established five types 
of subjects. Type I is in favor of interracial mar- 
riages, sit-ins and freedom marches, integrated hous- 
ing, integrated schools, and is opposed to segregated 
schools. Type II is opposed to interracial marriages,
but is otherwise like Type I. Type III is opposed 
to both interracial marriages and sit-ins, freedom 
marches, etc., but is in favor of integrated housing
and integrated schools and against segregated schools.
Type IV is opposed to interracial marriages, sit-ins 
and freedom marches, and integrated housing, but is 
in favor of integrated schools. Finally, Type V is 
opposed to interracial marriages, sit-ins, freedom 
marches, and integrated housing or schools, and is in 
favor of segregated schools. Thus, the affective re- 
sponses of the subjects form a Guttman scale, with 
interracial marriages as the most extremely favorable 
item, followed by sit-ins and freedom marches, fol- 
lowed by integrated housing, then by integrated 
schools, and finally by opposition to segregated 
schools. The data from these types of subjects were 
analyzed separately and a combined analysis was 
also performed. 
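
The typology above amounts to a cumulative (Guttman) pattern over the ordered items. The sketch below is one way to express it, assuming each response is coded simply as favorable or unfavorable. Only the four items whose direction is stated for every type are used, and the function and item names are hypothetical.

```python
# Items ordered from the most extreme favorable position to the least.
ITEMS = [
    "interracial_marriage",
    "sit_ins_and_marches",
    "integrated_housing",
    "integrated_schools",
]

# Number of favorable responses -> subject type, as described in the text.
TYPE_BY_SCORE = {4: "I", 3: "II", 2: "III", 1: "IV", 0: "V"}


def classify(responses):
    """Return the subject type, or None if the pattern is not cumulative."""
    favorable = [bool(responses[item]) for item in ITEMS]
    # In a perfect Guttman pattern, every unfavorable response precedes
    # every favorable one in the item ordering above.
    if favorable != sorted(favorable):
        return None
    return TYPE_BY_SCORE[sum(favorable)]


# Hypothetical subject: opposed to interracial marriage and sit-ins,
# in favor of integrated housing and integrated schools.
example = {
    "interracial_marriage": False,
    "sit_ins_and_marches": False,
    "integrated_housing": True,
    "integrated_schools": True,
}
print(classify(example))
```

A response pattern that breaks the ordering (for example, favoring interracial marriage while opposing integrated schools) is returned as `None` rather than being forced into a type.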

Analyses were carried out separately for each of
the four response continua employed in the present
experiment (the admiration, social distance, friendship,
and evaluation factors of the behavioral and
semantic differentials). Analyses of variance were
performed on the means of the responses of the subjects
of a particular type, to the particular stimulus.


TABLE 1


ANALYSIS OF VARIANCE OF COMPLEX STIMULUS PERSONS BASED ON COMPOSITE SCORES FOR
THREE BEHAVIORAL DIFFERENTIAL AND ONE SEMANTIC DIFFERENTIAL FACTORS (N = 94)


                               Friendship                  Evaluation                  Admiration               Social distance
Source             df        SS        F    % var        SS        F    % var        SS        F    % var        SS        F    % var
Race (A)            1  [illeg.]    29.7*      9.3       576      0.3      0.2     2,352      0.6      0.3   344,951  104.9**     57.0
Belief (B)          1  [illeg.]      4.3      1.3    45,296   21.3**     12.8   106,276   26.6**     14.0       208      0.1      0.1
Dress (C)           1     8,464     7.7*      2.4  [illeg.]      0.9      0.5        56    0.001     0.01     6,202      1.9 [illeg.]
English (D)         1   277,202  251.5**     78.3   269,980   125.6* [illeg.]   566,256  141.7**     74.7   189,442   59.3**     31.2
Sum of
  interactions     11    12,124                      23,703                      43,955                      35,137

Interactions (% variance only)

Source             Friendship   Evaluation   Admiration   Social distance
[label illeg.]            0.2     [illeg.]     [illeg.]          [illeg.]
B X A                     0.2          1.9          1.0               0.8
B X D                     1.7          3.3          2.4              0.01
C X D                     0.1          0.2         0.03              0.01
C X A                    0.01          0.3          0.8          [illeg.]
A X D                     1.2          0.6         0.01          [illeg.]

* p < .05.
** p < .01.


TABLE 2


PERCENTAGE OF VARIANCE OF SUBJECT TYPES CLASSIFIED BY POSITION ON THE CIVIL
RIGHTS ISSUES AS DETERMINED BY SCALOGRAM ANALYSIS

[Table body illegible. Columns: social distance, admiration, friendship, and
evaluation, each subdivided by subject group (Types I and II; Type III;
Types IV and V). Rows: determinants.]

Note. Type I = 18, Type II = 31, Type III = 34, Type IV = 5, Type V = 6; total N = 94.


RESULTS
Reliability
The reliability of the judgments was obtained
by repetition of two stimuli. The obtained reliabilities
for the various factors ranged between
Pearson r's of .59 and .85.


Relative Importance of the Characteristics


Table 1 presents the summaries of the analyses
of variance based on the sums of the responses
of the 94 subjects. The data were also broken
down by experimenter,² by sex of the subject,
by social class, and by religion of the subject.
No systematic deviations from the results of
Table 1 were obtained. An analysis of the responses
of the six subjects who considered themselves
to be dark showed a dramatic difference
in social distance towards Negroes, as compared
to the subjects who consider themselves to be
of light or medium color. The dark subjects emphasized
race about eight times less than did the
others, while they emphasized the level of English
about twice as much as the others. Because of
the small number of dark subjects, these results
must be considered as purely exploratory.

Table 2 shows the results of analyses of the
subjects classified into each of the five attitude
types, as per the Guttman scalogram analysis.


2 The experimenters differed in both sex and nationality.
Loh is Chinese.


DISCUSSION


The results of the present study are consistent
with previous research. Triandis, Fishbein, and
Hall (1964) found high correlations between the
admiration (social acceptance) factor of the behavioral
differential and the evaluative factor
of the semantic differential in person perception.
The present results confirm this finding by show- 
ing that in the case of both of these factors 
the level of spoken English is by far the most 
important determinant and the belief is the next 
most important. The importance of English as 
a factor in person perception was also established 
in that previous study. Previous work (Triandis 
& Triandis, 1960, 1962) had shown that race is
a most important determinant of the social dis- 
tance factor among American subjects. Again, 
race and level of English are important in the 
present study. Friendship is mainly determined 
by the quality of English, then by race, and 
least by the dress of the stimulus person. 

It is notable that each of the four stimulus 
characteristics employed in the present study had 
some influence in the determination of significant 
amounts of variance in one or another of the 
behavioral differential factors. Thus, the present 
findings confirm previous research which demon- 
strated that different aspects of the behavioral 
intentions of subjects towards stimulus persons 
are determined by different combinations of the 
characteristics of these stimulus persons.

The results of Table 2 are consistent with 
common-sense expectations. Race was a more
important factor in the determination of the
social distance judgments of prejudiced than in the
determination of the social distance judgments of
unprejudiced subjects. In the case of admiration,
English was the primary determinant for all
subjects, but the weight given to it differed between
the subgroups. Prejudiced subjects gave
some weight to race and belief (opinion). In the
case of friendship we find again the tolerant subjects
giving no weight to race and the prejudiced
giving a substantial weight to that characteristic
as a determinant of their judgments. 

These and similar previous findings suggest
that studies of “prejudice” should examine the
problem in greater detail. While race is an
extremely important determinant of social distance,
it is quite unimportant as a determinant of
admiration, at least for our subjects. Thus, while 
most of our subjects are willing to admire the 
ideas of a qualified Negro, many of the same 
subjects are not willing to accept him in their 
neighborhood. 

To understand prejudice in its full complexity, 
it is necessary to think of a matrix the rows of 
which are defined by various “undesirable” 
characteristics. In addition to those examined in 
the present study, we have asked subjects to 
respond to stimulus persons who differed from 
them in religion, age, sex, nationality, competence 
in doing a job, degree of sociability, etc. We 
presented stimulus persons with physical dis- 
abilities (e.g., deaf), with “a prison record,” 
with different shades of skin color, etc. On the 
column side of this analysis, we examined be- 
haviors such as “would not admire the ideas of,” 
“would not marry,” “would not accept as an 
intimate friend,” “would treat as a subordinate,” 
“would exclude from the neighborhood,” and
“would not hire” (Rickard, Triandis, & Patterson, 
1963; Triandis, 1963). When all of the informa- 
tion collected in these studies is placed together, 
it becomes clear that the classes of behavioral 
intentions mentioned above are distinct, and in- 
fluenced by different combinations of “undesir- 
able” characteristics. “Admire the ideas of” is 
most sensitive to characteristics indicative of 
status—for example, kind of English spoken, 
occupation—and also to the opinions of the 
stimulus persons; marital acceptance is sensitive
to English, but also to race and age; friendship
acceptance is sensitive to age, sex, religion,
English, race, dress, and skin color; subordina- 
tion is likely when the stimulus person is of a 
high status occupation; exclusion from the 
neighborhood is primarily sensitive to race and 
English, and slightly to religion; acceptance as 
an employee is primarily sensitive to competence 
and disability, and secondarily to race and 
sociability.

In dealing with this problem it is necessary to 
face the enormously difficult task of separating 
valid from invalid cues. It might be argued 
that when a person speaks ungrammatical English 
this is a valid cue that the college student sub- 
ject would not be likely to experience much 
satisfaction if he were to establish a friendship 
with such a person. (Or is this snobbery?) On
the other hand, when he sees a Negro and he 
comes to the same conclusion, it might be 
argued that he is making an incorrect inference, 
hence, the designation prejudiced. However, to 
argue that the subject is making an incorrect
inference implies that the social scientist knows
the ecological validity (Brunswik, 1947) of the 
cue, in connection with each of the behaviors in 
question. In fact, the scientist usually does not 
know these validities. A new approach to the 
study of prejudice would require the prior study 
of the ecological validities of each of the sup- 
posedly undesirable personal characteristics, for 
each of the classes of behavior under investiga- 
tion. Subsequently, an examination of the weights 
given by a subject to these characteristics of 
stimulus persons may lead to a comparison with 
the ecological validities of the cues. Large dis- 
crepancies between the value of the validity 
coefficients for each cue, and the weights given 
by the subject would lead to the conclusion that 
the subject is prejudiced about a particular 
characteristic in connection with the particular 
class of behaviors. Thus, each cell in the 
matrix described above, would potentially define 
one kind of prejudice. A highly prejudiced person 
would most likely be found to be prejudiced in 
a large number of these cells, while other in- 
dividuals might have only certain “blind spots.” 
To begin such a research effort it would be 
necessary to develop criteria of “effective be-
havior” and “satisfaction with own behavior,”
which would permit the determination of the 
ecological validities required by the research 
program. This approach would involve a re- 
orientation of present research on prejudice, but 
it would place such research in the center of the 
fundamental problems of social psychology.


LOCUS AND ORIENTATION OF THE PERCEIVER (EGO)
AND THE ROTATION OF VISUAL IMAGES 


THOMAS NATSOULAS AND JOHN T. MURPHY 


University of California, Davis University of Wisconsin 


A study correlating the performance of 14 college students on a task involving 
tracings on the left side of the head and on a task involving the rotation for 
response of diagrams visually presented. The responses to the tracings on the 
skin occurred under 2 kinds of instruction; Ss had to respond either from an 
external perspective or from an internal one. The responses to the visual figures 
required that S rotate a diagram on different trials, 180° about its vertical
axis, 180° about its horizontal axis, and 180° in the frontal plane. The results
showed that responding from an internal perspective was more accurate and
faster than responding from an external perspective; and that the 3 kinds of
rotation differed significantly in accuracy and response time. Accuracy of
performance under internal instructions was positively and significantly cor-
related with number of correct responses under all 3 instructions for rotation,
while low and nonsignificant correlations were found between the latter tasks
and external instructions on accuracy.


The present study is concerned with the rela- 
tionship between the ability to take either an 
external locus or to reorient one's internal locus 
for perception of cutaneous-form stimulation, 
and the ability to rotate for response, figures 
visually presented. In the conceptual scheme 
proposed by Natsoulas and Dubanoski (1964), 
the perceiver (ego) is treated as an entity in the 
psychological field (not necessarily in awareness), 
whose locus in the field and whose orientation 
relative to it can change. It is assumed that when 
a figure is traced on the head, perception of it 
from an internal (subjective) perspective may 
necessitate a shift in the angle of orientation of 
the perceiver; perception from an external 
(objective) perspective requires a shift in the 
locus of the perceiver from the usual internal 
one to an external locus. Thus, performance 
under internal-perspective instructions (I instruc- 
tions) should yield a measure of the subject’s 
ability to reorient his perceptual locus; E instruc-
tions should provide a measure of his ability to
change his perceptual locus.
Since the theoretical concept of the perceiver 
(ego) is depicted as an entity which undergoes 
rotation and relocation within the field, one 
might expect a positive relationship to exist 
between a person's ability to take each perspec- 
tive and his ability to rotate visual images. In 
other words, mental manipulation of the per- 
ceiver and of visual images should be associated. 
In the figure-rotation phase of the experiment 
reported here, the specific task used is one bor- 
rowed from Sato (1960), who asked his subjects 
to draw each of five figures rotated 180 degrees 
in the frontal plane, 90 degrees clockwise or
counterclockwise in the frontal plane, and 180 
degrees about its vertical axis (“as though it were 
observed from the backside of the cardboard"). 
As Sato’s description of rotation about the 
vertical axis implies, a change in locus for per-
ception could be the basis for accurate respond-
ing, in which case a positive correlation with
performance under E instructions would be ex- 
pected. Similarly, if rotation of the visual figure 
in the frontal plane involves a reorientation of 
the perceiver, then a positive correlation should 
occur with performance under I instructions. 
Alternately, suppose that performances on all 
three tasks in the figure-rotation phase were 
highly correlated with performance under I in- 
structions. Such a result would cast doubt on a 
basic assumption of the conceptual scheme, that 
of a fixed image of the stimulus and a flexible 
(reorienting, relocating) perceiver; an interpreta- 
tion of responding under E instructions which 
says the subject perceives from an internal locus 
and then rotates the tracing to give the appro- 
priate response would be supported. 


METHOD 
Subjects 


Fifteen men and women, undergraduates at the 
University of Wisconsin, served individually as sub- 
jects. They volunteered to participate in exchange for 
points on their final examinations in the introductory 
psychology course. Every subject was tested by the 
same male experimenter in the same experimental 
room. 


Procedure 


Upon arrival the subject was seated facing a bare 
wall while the experimenter sat within arm’s reach 
at right angles to the subject facing the left side 
of his head. The time between trials within any 
series was about 5 seconds. The two parts of the 
experimental session were continuous with no pause 
between parts. 

In Part I the following were the critical parts 
of the instructions: 


Now I want you to think of this clear glass 
dish as the side of your head. Suppose I draw 
the lower case letter b on it. This is clearly a b 
from my point of view, and at the moment, from 
yours. But if I turn the dish about, it appears 
as though I have drawn the lower case letter d. 
Similarly, were I to draw a p on your temple, 
it would appear as such from my point of view; 
from yours it would look like a q [the experi- 
menter demonstrated on the glass dish]. 

There are, then two perspectives from which 
you can perceive something drawn on your head: 
an internal perspective or point of view and an 
external perspective or point of view. Therefore 
when I ask you what is being traced on your 
temple, there are two correct answers, depending 
on your perspective or point of view. . . .

Just before I trace a figure, I will say internal 
or external and thereby indicate the perspective 
from which I want you to draw. The figures are 
very simple, as you will see, but they are not sym-
metrical and therefore they do appear differently
depending on the perspective I indicate I want
you to take. 


The subject was shown the 12 figures on a piece 
of paper, and then instructed to keep his eyes closed 
throughout this part of the experiment, even when 
he drew the figures on a tablet. The experimenter 
then proceeded to trace in this order of presentation
of instructions: I, E, E, I, E, I, I, I, E, I, E, E,
IST EE EE T DE LOEO E.LOLOESLSE,
EE I, E, I, I, E, E, I, E, I, I, I, E, E, I, E, I. On
a trial “external” or “internal” was called and im- 
mediately the tracing was begun. The experimenter 
began timing as he began to trace and stopped when 
the subject completed his drawing. There were 24 
different tracings in all (see Stimuli) presented twice 
to all subjects in the same order. 


In Part II the subjects were asked to 


draw each diagram as if it were turned upside 
down. That is, as if each diagram were flipped 
over on its head [the experimenter demonstrated 
with the letter p and b and a meaningless figure 
on a slip of paper]. I will also ask you to draw 
the mirror image of the diagrams [the experi- 
menter demonstrated as above]. Finally, I will 
ask you to draw each figure as if it were rotated 
180 degrees in the frontal plane [the experimenter 
demonstrated as before]. 


The experimenter proceeded to show Diagrams 1-5 
(see Stimuli) in that order three times, associated 
with the following instructions: mirror image (MI), 
MI, upside down (UD), rotate 180 degrees (Ro), 
MI, UD, UD, Ro, MI, UD, Ro, Ro, MI, UD, Ro. 
On each trial, the experimenter began timing when 
he showed the diagram and stopped the watch when 
the subject completed his drawing. Again about 5 
seconds were allowed between trials. 

The data of 14 of the subjects are used in the 
analyses of this part of the experiment. One subject 
took so long on the first few diagrams that the 
available time was exhausted. 


Stimuli 


The series of figures for Part I consists of 45- 
degree and 90-degree angles drawn beginning at one 
corner of an imagined square, with vertex at another 
corner, and ending at a third. The 24 tracings used 
exhaust the possibilities for beginning at every corner 
of the square, going to every other corner, and from 
there to one or the other of the remaining corners, 
all in straight lines. The diagrams used in Part II 
are the same ones as shown by Sato (1960) and 
are referred to by number in the same way. 
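The count of 24 tracings follows from the description above and can be checked by enumeration. The following is a minimal sketch, assuming a unit square whose corner coordinates are chosen purely for illustration; the names `CORNERS` and `vertex_angle` are hypothetical and do not come from the original procedure.

```python
from itertools import permutations
import math

# Corners of an imagined unit square (coordinates are an illustrative assumption).
CORNERS = [(0, 0), (1, 0), (1, 1), (0, 1)]


def vertex_angle(start, vertex, end):
    """Angle in whole degrees formed at `vertex` by the two straight strokes."""
    v1 = (start[0] - vertex[0], start[1] - vertex[1])
    v2 = (end[0] - vertex[0], end[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cosine = dot / (math.hypot(*v1) * math.hypot(*v2))
    return round(math.degrees(math.acos(cosine)))


# Every ordered choice of start corner, vertex corner, and end corner.
tracings = list(permutations(CORNERS, 3))
angles = [vertex_angle(*t) for t in tracings]

print(len(tracings))                        # number of distinct tracings
print(angles.count(90), angles.count(45))   # right angles vs. 45-degree angles
```

Under these assumptions the enumeration gives 4 x 3 x 2 = 24 tracings, of which 8 form 90-degree angles and 16 form 45-degree angles, in agreement with the statement that only those two angle sizes occur.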


RESULTS 
Part I 


Responses were scored as correct so long as 
there was an indication the correct perspective 
had been taken. Of all responses to I instructions 
(E instructions) marked correct, 97% (90%) 
were perfectly correct in the sense of not in-
volving any other kind of reversal or rotation,
for example, up-down reversal or 90-degree rota-
tion with no reversal. Two analyses of variance
were performed, both of the Subjects × Blocks
× Instructions type, on the mean numbers of
responses correct and the mean latencies. Fifteen 
subjects, four blocks of 12 trials each, and 2 
kinds of instructions were involved. 

The two instruction conditions gave different 
mean frequencies of correct responses, 5.0 (out 
of a possible 6.0) under I instructions and 4.5 
correct under E instructions (F = 7.08, df
= 1/101, p < .01). There was some tendency
(ns) for successive blocks to show better per-
formance (4.5, 4.5, 4.8, 5.0 mean correct re-
sponses). Instructions produced little difference
in mean latencies (approaching significance, F
= 4.34, df = 1/14, p < .10) with I instructions
yielding slightly faster responses (5.36 seconds)
than E instructions (5.72). Over blocks a decline
in mean latencies was evident with means for
successive blocks of 6.05, 5.51, 5.40, and 5.19
seconds (ns). An analysis of latencies for re-
sponses correct under the more stringent criterion
(see above percentages of 97 and 90) provided
a significant (F = 6.91, df = 1/101, p < .05) dif-
ference in latencies between responses under I
instructions (M = 5.27 seconds) and E instruc-
tions (M = 5.64 seconds). In all of the above
analyses subjects was a significant main effect
(p < .01).
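The degrees of freedom reported for Part I can be checked by simple counting. The sketch below rests on two assumptions that are inferred from the numbers and are not stated in the text: that the error term with df = 101 is the residual left after removing the three main effects from the 15 × 4 × 2 table of cell means, and that the error term with df = 14 is the Subjects × Instructions interaction. The function names are hypothetical.

```python
def residual_df(levels):
    """Residual df after removing all main effects from a fully crossed table
    with one observation (cell mean) per cell."""
    cells = 1
    for n in levels:
        cells *= n
    total_df = cells - 1
    main_effect_df = sum(n - 1 for n in levels)
    return total_df - main_effect_df


def interaction_df(levels_a, levels_b):
    """df of a two-way interaction between factors with the given level counts."""
    return (levels_a - 1) * (levels_b - 1)


# 15 subjects, 4 blocks, 2 kinds of instructions.
pooled_error_df = residual_df([15, 4, 2])
subjects_by_instructions_df = interaction_df(15, 2)

print(pooled_error_df, subjects_by_instructions_df)
```

Both values agree with the denominators given in the text (101 and 14), which makes the two assumed error terms at least plausible readings of the analyses.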


Part II 


The criterion for a correct response is at most 
one line of a diagram showing the wrong kind of 
reorientation. An overall Cochran's Q test for 
the 15 conditions yielded a Q value of 44.26 
(p < .01). Eight additional Q tests, one for each 
diagram and one for each instruction condition 
were performed. Diagrams 1, 3, and 4 differenti- 
ated the three instruction conditions to a sta- 
tistically significant degree; RL instructions ap- 
pear to be the easiest to follow, with UD instruc- 
tions slightly more difficult, and Ro instructions 
providing, in the case of all diagrams, the fewest 
correct responses. None of the three rotation
tasks differentiated the five diagrams to a sta- 
tistically significant degree. 
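For readers unfamiliar with Cochran's Q, the statistic can be computed from a subjects-by-conditions table of correct (1) and incorrect (0) responses. The following is a minimal sketch of the standard formula; the data in `toy_table` are invented for illustration only and are not the data of this experiment, and the function name is hypothetical.

```python
def cochran_q(table):
    """Cochran's Q for a list of rows (subjects), each a list of 0/1 scores
    across k related conditions. Returns (Q, degrees of freedom).

    Undefined (division by zero) if every subject scores all 0s or all 1s.
    """
    k = len(table[0])
    column_totals = [sum(row[j] for row in table) for j in range(k)]
    row_totals = [sum(row) for row in table]
    grand_total = sum(row_totals)
    numerator = (k - 1) * (k * sum(c * c for c in column_totals) - grand_total ** 2)
    denominator = k * grand_total - sum(r * r for r in row_totals)
    return numerator / denominator, k - 1


# Invented example: 6 subjects by 3 conditions.
toy_table = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
    [1, 1, 0],
]
q_value, q_df = cochran_q(toy_table)
print(q_value, q_df)
```

Q is referred to the chi-square distribution with k - 1 degrees of freedom, so the overall test over 15 conditions described above would carry 14 degrees of freedom.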

An analysis of variance of Subjects × Instruc-
tions × Diagrams performed on the response
times yielded no significant interactions and three
significant main effects (in all cases p < .01). The
F value for subjects was 6.53, df = 13/190. The
three instructions showed mean response times
as follows: UD = 29.64, RL = 27.79, Ro
= 44.30. In this case F = 18.43, df = 2/190.
Diagrams yielded F = 5.96, df = 4/190. The
mean seconds for the five diagrams were, respec-
tively, 34.04, 25.16, 39.70, 40.90, and 29.72.


TABLE 1


INTERCORRELATIONS BETWEEN MEASURES
FROM PARTS I AND II


          Responses correct               Response times
       I      UD     RL     Ro       I      UD     RL     Ro
E      .152   .160   .133   .426     .359   .467   .421   .498
I             .570   .838   .680            .522   .375   .326
UD                   .482   .532                   .546   .575
RL                          .616                          .822


Note: p = .05 if r = .514; p = .01 if r = .641.


Correlations 


Product-moment correlations were obtained 
between the performance measures and time 
measures of the two parts. These are shown in 
Table 1. With respect to the number of items 
correct, the three tasks of Part II are on the 
average significantly, positively intercorrelated 
whereas performance under I instructions is not 
correlated with performance under E instructions. 
Number of responses correct under UD, RL, 
and Ro instructions is positively correlated with 
number of responses correct under I instructions, 
but not correlated significantly with responses cor- 
rect under E instructions. With respect to response 
times, the three tasks of Part II are intercor- 
related positively and significantly, as are UD 
instructions with I instructions. 
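The critical values of r given in the note to Table 1 can be recovered from tabled values of t. The sketch below assumes 15 pairs of observations (df = 13) and two-tailed tests, and uses the standard tabled t values of 2.160 and 3.012 for those levels; the function name is hypothetical.

```python
import math


def critical_r(t_critical, df):
    """Smallest |r| reaching significance, given the critical t for df = N - 2."""
    return t_critical / math.sqrt(t_critical ** 2 + df)


# Tabled two-tailed critical t values for df = 13 (N = 15 is an assumption).
r_05 = critical_r(2.160, 13)
r_01 = critical_r(3.012, 13)

print(round(r_05, 3), round(r_01, 3))
```

Under these assumptions the sketch yields .514 and .641, matching the note, which suggests that the tabled criteria correspond to N = 15 even though only 14 subjects contributed data to Part II.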


DISCUSSION 


It was argued earlier that accuracy of per- 
formance under I instructions would be a meas- 
ure of ability to change perceptual orientation, 
whereas accuracy of performance under E in-
structions would indicate the subject's ability
to shift to an external perceptual locus. In the 
present studies these abilities appear to be un- 
correlated. The absence of a significant correla- 
tion between accuracy in responding under E 
instructions and accuracy in Part II of UD and 
Ro rotations is not surprising; it is expected on 
the grounds that the latter two tasks cannot be 
readily conceptualized as involving a change in 
perceptual locus. As for RL rotation, if it did
involve perceiving the stimulus from an external
locus, from behind it, the subjects’ accuracy
on this task should have been correlated posi-
tively and significantly with number correct
under E instructions in Part I. It appears then
that a more accurate general description of the 
underlying processes involved in the rotation 
tasks may be contained in the phrase “rotation of
visual images.” Assuming this to be the case,
the hypothesis can be eliminated that the same 
process underlies performance under E instruc- 
tions, that is, first perception from an internal 
locus and then right-left reversal. 

In contrast, performance under I instructions 
was well correlated positively with all three 
rotation tasks. It appears that reorientation of 
the perceiver and rotation of visual images de- 
pend on similar underlying processes. There re- 
mains the possibility that all perceptions of trac- 
ings drawn on the head are from an external 
locus, and for purposes of responding from an 
internal locus a right-left reversal is then per-
formed. This would lead to high positive correla-
tions with visual-rotation tasks, especially RL
rotation, a result found in the present experi-
ment. However, if this were the case, accuracy
of performance and latency of response under E
instructions would be greater and less, respec-
tively, than in the case of I instructions. Just
the reverse was found.


Journal of Personality and Social Psychology
1966, Vol. 3, No. 4, 475-479


INTERACTION OF ANXIETY AND ABILITY IN COMPLEX
LEARNING SITUATIONS 1


MARTIN KATAHN
Vanderbilt University


2 separate attempts failed to replicate the 1st study in which the interfering 
effects of anxiety in complex serial learning were demonstrated. Learning of the 
serial-verbal maze, however, proved to be significantly correlated with mathe- 
matical aptitude. Compared with low anxiety, high anxiety in combination 
with high aptitude was found to facilitate performance. Furthermore, high 
anxiety in combination with high overall scholastic aptitude was found to 
facilitate academic performance, while high anxiety in combination with low 
and average aptitude was found to interfere with academic performance. 
Other data are referred to which indicate that the academic achievement- 
anxiety-aptitude relationship may vary with the difficulty of the academic task 


and student study habits. 


While Spence and Taylor (Spence, 1958) 
have modified their original drive theory formu- 
lation to include the possible interaction of other 
variables with anxiety, the original statement 
that high anxiety should interfere with perform- 
ance on a complex task is still widely quoted 
(e.g., Baughman & Welsh, 1962; Murphy, 1964).
The present writer has failed to replicate a num-
ber of the original studies and has sought to
discover the responsible factors. Recently, Spiel-
berger and Weitz (1964) have presented data
which indicate that one of the variables likely
to interact with anxiety is aptitude. Compared
with low aptitude, high aptitude for a given task
reflects the fact that task relevant (correct)
responses are relatively higher in subjects’ re-
sponse hierarchies. Task complexity remains a
major factor affecting performance, of course,
since high anxiety may still interfere with learn-
ing at the highest difficulty levels. Nevertheless,
there are certain tasks with a priori validity as
moderately complex learning situations in which
high aptitude may operate to lower “effective”
difficulty, thus converting them into “simpler”
tasks. In such instances high anxiety may facili-
tate learning. Certain commonly used serial and
paired-associate lists may fall into this category.

1 This research was supported, in part, by Grant
MH-08784-01 from the National Institute of Men-
tal Health, United States Public Health Service.

Reported herein are two separate attempts to 
replicate the first complex serial learning study 
in which the differential effects of anxiety as a 
function of task difficulty were noted (Taylor & 
Spence, 1952). Aptitude was investigated as a
variable after its correlation with performance
became evident in the first experiment. In order
to increase the generality of the findings, cor-
relations between anxiety, aptitude, and academic
performance were also examined. A portion of
Spielberger and Weitz's (1964) data suggested
that high anxiety may facilitate performance for
high aptitude students and interfere with the
performance of low and average students. Other
findings of theirs indicated, however, that this
effect may depend largely on the difficulty of
the academic task and student study habits.
These factors will be more fully discussed later
in this paper.


TABLE 1


MEAN TRIALS TO CRITERION AND MEAN ERRORS
ON THE SERIAL LEARNING TASK


Group               N    Trials    Errors
Original study
  HA               20    32.75    157.55
  LA               20    25.12    118.70
Replication 1
  HA               23    18.78     55.48
  LA               23    16.65     50.00
Replication 2
  HA               32    13.06     40.22
  LA               32    17.47     56.78


METHOD 
Subjects 


Subjects were selected according to their scores on 
the Taylor Manifest Anxiety scale from introduc- 
tory psychology courses at Vanderbilt University 
during the spring semesters of 1963 and 1964. A total 
of 110 subjects participated, 46 in the first study, 
and 64 in the second. The number of male and 
female subjects in each anxiety group was propor- 
tional. High anxiety (HA) scores ranged from 22 to 
38, and low anxiety (LA) scores ranged from 1 to 
11. Two HA subjects and 2 LA subjects were de- 
leted from aptitude comparisons for lack of scores 
on the Scholastic Aptitude Test (SAT) of the Col- 
lege Entrance Examination Board (1956). One 
additional HA subject and 5 additional LA subjects 
were deleted from grade-point average (GPA) com- 
parisons due to the fact that they were special stu- 
dents for whom GPAs are not computed. GPAs are 
a weighted average of academic performance where 
3 points are credited for each hour of A, 2 for B, 
1 for C, 0 for D and F. (Differences in hours cred- 
ited toward graduation requirements for D and F 
grades are not important for GPAs as they are used 
in this study.) A 2.00 average (honor roll) is ob- 
tained by about 15% of the student body. 
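The weighting scheme just described can be illustrated with a short computation. This is a minimal sketch of the stated rule only; the course record in `example_record` is invented for illustration, and the names `GRADE_POINTS` and `grade_point_average` are hypothetical.

```python
# Points per credit hour as stated in the text: A = 3, B = 2, C = 1, D and F = 0.
GRADE_POINTS = {"A": 3, "B": 2, "C": 1, "D": 0, "F": 0}


def grade_point_average(record):
    """Hour-weighted average for a list of (letter grade, credit hours) pairs."""
    total_hours = sum(hours for _, hours in record)
    total_points = sum(GRADE_POINTS[grade] * hours for grade, hours in record)
    return total_points / total_hours


# Invented record: 3 hours of A, 3 hours of B, 4 hours of C.
example_record = [("A", 3), ("B", 3), ("C", 4)]
example_gpa = grade_point_average(example_record)
print(round(example_gpa, 2))
```

On this scale the invented record falls just short of the 2.00 average that the text identifies with the honor roll.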


Apparatus and Procedure 


A Lafayette Model 303B memory drum, mounted 
on a table behind a plywood screen, was used for 
presentation of the material. All arrangements cor-
responded to those used in the original study. The
learning task (a verbal maze) consisted of a series 
of 20 choices, typed on continuous white paper, 
each choice consisting of either the word “right” or 
the word “left.” The choices were presented serially 
with a 2-second exposure time and a 2-second in- 
terval between exposures. The subject’s job was to 
learn to anticipate each choice during the blank 
interval before it was exposed. The intertrial in- 
terval was 6 seconds. Criterion was two successive 
errorless trials. 


RESULTS AND DISCUSSION 


Table 1 presents mean trials to criterion and 
mean errors for the two replications and for the 
original study. Differences between anxiety 
groups were not significant in the first replica- 
tion, and significant in the opposite direction in 
the second (for trials, F = 7.89, df = 1/60, p< 
.01; for errors, F = 5.44, p < .025). It is clear 
that the anxiety-learning relationship, as origi- 
nally formulated, did not hold up. Furthermore, 
something seems to have occurred in the present 
study which especially affected the performance 
of the HA subjects, since they demonstrated 
such a large improvement from Replication 1 to 
Replication 2. Gross task aptitude differences 
were suggested by the fact that subjects in the 
replication attempts learned the list, on the 
average, almost twice as quickly as those in the 
original study. Since the population from which 
the present samples were selected had mean 
SAT scores approximately 1 SD above the na-
tional average, the results of the first study sug-
gested that aptitudes measured by the SAT and
performance on the task might be correlated.
The correlations obtained were highly similar for
both studies. Therefore, the two samples will be
treated as one unit for the following ability
comparisons.

[Fig. 1. Mean trials to criterion on the serial ...]

The Verbal Aptitude test (VAT) and Mathe-
matical Aptitude test (MAT) of the SAT had
correlations of -.04 (ns) and -.27 (p < .01),
respectively, with performance on the task (neg-
ative correlations will be obtained if higher ap-
titude subjects take fewer trials to reach cri-
terion). The significant correlation with the
MAT may be due to a short-term memory fac-
tor common to both mathematical ability and
serial learning (Ellis, 1963). The small insignifi-
cant correlation with VAT may be due to the
fact that verbal serial maze learning has little in
common with the ability to understand and use
meaningful verbal relationships.
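The significance of a correlation of this size can be gauged by converting r to t. The sketch below assumes N = 106, that is, the 110 subjects less the 4 without SAT scores; the text does not state the N actually used for these correlations, so this figure is an assumption, and the function name is hypothetical.

```python
import math


def t_from_r(r, n):
    """t statistic and degrees of freedom for testing a correlation against zero."""
    df = n - 2
    t_value = r * math.sqrt(df) / math.sqrt(1 - r * r)
    return t_value, df


# Correlation of MAT with trials to criterion; N = 106 is an assumption.
mat_t, mat_df = t_from_r(-0.27, 106)
vat_t, _ = t_from_r(-0.04, 106)

print(round(mat_t, 2), mat_df, round(vat_t, 2))
```

With about 100 degrees of freedom the tabled two-tailed .01 criterion for t is roughly 2.6, so under the assumed N the MAT correlation exceeds it in absolute value while the VAT correlation falls far short, consistent with the significance levels reported.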

It can be seen from Figure 1 that high and 
low anxious subjects with relatively low MAT 
scores for the present sample performed simi-
larly, but high anxious subjects of high aptitude
were superior to comparable low anxious sub-
jects. The interaction of Anxiety × Aptitude
was significant (F = 4.94, df = 1/102, p < .05),
as was the direct comparison of HA and LA high
aptitude groups (t = 2.99, df = 52, p < .01).
These results demonstrate the facilitating effects
of high anxiety on high aptitude, but not inter-
fering effects of anxiety on low aptitude. Since
all but five subjects in the entire present sample 
had MAT scores above the national mean, a test 
of the effect of anxiety on low or average apti- 
tude is not feasible with the replication data. 
However, the HA subjects who do have scores 
below the Vanderbilt MAT median of 611 show 
a very slight inferiority to similar aptitude LA 
subjects. 

To illustrate the effects of anxiety on a sample
composed to a greater extent of low and average
aptitude subjects, the findings of the original
Iowa study are also plotted in Figure 1. If it is
assumed that about half the subjects in the Iowa
study were below the national aptitude average,
with the great majority certainly falling within 1
SD above the mean, then the results of all three
studies become accountable on the basis of an
Anxiety × Aptitude interaction. The interfer-
ence effect is noted in the Iowa sample which
contains, to a large extent, subjects of average
and lower ability. The crossover point appears to
occur for subjects of slightly superior aptitude
(between the mean and 1 SD above on the MAT
in the present study). The facilitation effect is
noted when aptitude exceeds 1 SD above the
mean. This interaction of anxiety with aptitude
may also help account for the difference in per-
formance between the HA groups themselves in
Replications 1 and 2 of the present study. It will
be recalled that the HA group of the second
sample was significantly superior to the LA group
and that the difference between HA groups in
the two replications was even greater (see Table
1). Mean MAT score of the second sample was
627, compared with 597 in the first sample, an
increase of 30 points.

2 The inference with respect to academic aptitude
is based on communication with the Director of
Admissions at Iowa, present-day aptitude measures
(since none are available for the period of the
original Iowa experiments), and personal com-
munication with former graduate students in the
department of psychology at Iowa during the pe-
riod of the original study. Considering as well the
striking performance difference between the original
and present samples on the serial task, it seems
fairly safe to make the aptitude difference assump-
tion when comparing the two studies.

Another area which appears to be sensitive to
the interaction of anxiety and aptitude is aca-
demic performance (Spielberger & Weitz, 1964),
but here the relationship of these variables is
exceedingly more complex and difficult to predict
than it is in serial learning. Whereas in labora-
tory tasks the criteria of performance (errors,
number of correct responses, trials to criterion) re-
main constant across experiments, grading stand-
ards may vary from course to course, college to
college, and within the same institution over
time. Thus, similar grades may not reflect equiva-
lent achievement in tasks of comparable diffi-
culty. For the present sample, the relationship of
anxiety and aptitude (total SAT score) to GPA
is presented in Figure 2. The arbitrary assign-
ment of ability levels depicted in the graph re-
sults in unequal n at corresponding levels of
aptitude within each anxiety group. However,
one way of illustrating interaction in such a
case is through a difference in correlation between
aptitude and GPA within each anxiety group
considered separately. A stronger correlation in
the HA group, compared with the LA group,
would indicate a difference in trend, which in
turn would reflect an interaction of anxiety and
ability. Aptitude and GPA have a correlation of
.47 in the HA group and .24 in the LA group
(z = 1.77, p < .10, two-tailed test). While the
difference between correlations does not quite
reach an acceptable level of significance, it seems
suggestive since a similar finding for high and
low test anxious groups has previously been re-
ported by Grooms and Endler (1960).

[Fig. 2. ... anxiety and total SAT score.]

In the present study, the interaction of apti-
tude and anxiety is manifest in another interest-
ing finding: 7 of the 10 highest aptitude HA
subjects (SAT score above 1300) were making
Dean’s list (GPA of 2.00 or over) compared with
only 3 of the 10 highest aptitude LA subjects.
Furthermore, 6 of the 11 lowest aptitude HA
subjects (SAT score below 1100) were doing less
than creditable work (GPA under 1.00) com-
pared with only 3 of the 11 lowest aptitude LA
subjects. Considering these frequencies together,
13 of the 21 HA subjects at the extremes of apti-
tude were performing at the extremes academi-
cally, compared with only 6 of 21 LA subjects
(χ² = 4.71, df = 1, p < .05). The results of this
study thus illustrate the interfering effects of
anxiety on low and average ability, and the fa-
cilitating effects of anxiety on high ability, in
academic performance.
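The chi-square value for the combined frequencies can be reproduced from the fourfold table implied by the text: 13 of 21 HA subjects and 6 of 21 LA subjects performing at the academic extremes. The following is a minimal sketch using the standard shortcut formula for a 2 × 2 table; it assumes no continuity correction was applied, which the text does not state, and the function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: HA, LA. Columns: at the academic extremes, not at the extremes.
chi2 = chi_square_2x2(13, 21 - 13, 6, 21 - 6)
print(round(chi2, 2))  # 4.71
```

The uncorrected statistic agrees with the reported value of 4.71 with 1 degree of freedom, which supports the assumption that no continuity correction was used.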

While the above findings seem clear-cut, data 
reported by Spielberger and Weitz (1964) pre- 
vent the making of any simple theoretical or 
generalized predictive statement about the in- 
teraction of anxiety and aptitude on academic 
achievement. Spielberger and Weitz obtained 
results similar to those reported here in a sample 
of Duke University students for the years
1954-57. For 1959 and 1960, however, anxiety inter-
fered with performance for average and high
aptitude students, and in 1960, HA low apti-
tude students were actually superior to LA stu-
dents of low aptitude. These bewildering results
were explained by pointing out that grading
standards were raised as the aptitude of the
student body increased during the years of the
study. Although Spielberger and Weitz state
that average grades tended to remain constant,
their graphs reflect the increased difficulty of the
academic task, that is, grades for low aptitude 
subjects clearly became lower over the years. 
Thus, difficulty may have reached the point 
where anxiety would interfere even at the high- 
est levels of aptitude. 

The finding that HA low aptitude students 
tended to be somewhat superior to LA low apti- 
tude students in 1960 appeared to result from the 
fact that the particular HA students entering 
Duke had developed superior study habits, com- 
pared with LA low aptitude students. Better 
study habits had apparently enabled the HA 
students to perform well enough in high school 
to warrant admission to Duke. Although it is a 
few years behind Duke in this respect, Vander- 
bilt also has been raising standards for admis- 
sion and for academic grades. This suggests that 
the relationships between anxiety, aptitude, and 
GPA at this university may, over the next few
years, follow the pattern obtained by Spiel- 
berger and Weitz. This is being investigated. In 
any event, the interaction of anxiety and apti- 
tude on academic achievement appears to depend 
upon the difficulty of the academic task, and 
upon other modifier variables such as study 
habits. 

Returning now to the Anxiety × Aptitude in-
teraction in verbal learning tasks, a word seems
in order to account for a recently reported 
failure to obtain such an interaction (Harleston, 
1963). Harleston's results may be due to the 
fact that his measure of ability, performance on 
meaningful paired associates, was not correlated 
with the criterion task of paired nonsense sylla- 
bles, just as verbal aptitude (VAT) was not cor- 
related with serial learning in the present study 
(the correlation between tasks was not reported). 
Another possibility is that the anxiety groups 
were not comparable in ability to begin with. In 
10 of 12 comparisons regardless of ability level, 
Harleston's HA subjects were inferior to his LA 
subjects. If the high anxious group was not truly 
comparable in the presumably correlated ability, 
high anxiety could not be expected to facilitate 
the performance of the “high ability” subjects. 

The present results, taken together with those 
of Spielberger and Weitz, indicate that aptitude 
is a factor which should be taken into account 
when predictions are made on the basis of anxiety 
(drive) and task difficulty. The oft quoted 
statement that anxiety interferes with performance 
on a complex task evidently needs a revision 
which points out that actual or “effective” 
task difficulty depends upon subjects’ aptitude for 
that task, for certain ranges of difficulty at least. 
Aptitude and task complexity apparently interact 
to affect the relative position of correct and 
incorrect responses in the response hierarchy, 
thus influencing the speed of learning. 
SEX AND BIRTH-ORDER DIFFERENCES IN CONFORMITY AS 
A FUNCTION OF NEED AFFILIATION AROUSAL 1 


WILLIAM C. CARRIGAN AND JAMES W. JULIAN 
State University of New York at Buffalo 


Birth-order and sex differences in conforming behavior were examined for 96 
6th-grade children under neutral and socially threatening conditions. Need 
Affiliation was aroused by having Ss 1st rate fellow classmates on a brief 
sociometric questionnaire. They then completed the conformity task, which was 
to pick appropriate story descriptions for a set of pictures for which popular 
story choices had previously been indicated. As hypothesized, 1st-born or only 
children were more influencible than later-born children, and females more 
than males. These differences increased under conditions of heightened affiliative 
arousal. Interactions between sex and treatment condition also emerged, 
suggesting a need for greater caution in generalizing about personality 
differences in influencibility. 


Several years have passed since Sears (1950) 
proposed the treatment of ordinal position in the 
family as an important psychological variable. 
Although such a view had long before been 
identified in the writings of Alfred Adler (Ans- 
bacher & Ansbacher, 1956), little empirical work 
had been reported until recent years. Schachter's 
(1959) book dealing with several facets of 
affiliative behavior stands as a major stimulus 
to much of this current work. 

Schachter (1959) argued for and presented 


1 Portions of this paper were read at the Mid- 
Western Psychological Association, Chicago, May 
1965. The authors wish to thank Thomas J. Ahern, 
Assistant Principal, Williamsville, New York, school 
system for his cooperation and help. 


evidence to demonstrate that dependency, that 
is, seeking to be with others, is related to birth 
order. Indeed, at least under threatening or 
anxiety-arousing conditions, persons who were 
first-born or only children appeared more de- 
pendent than individuals who had older siblings. 
Schachter defined dependency as “the extent to 
which the individual uses or relies on other 
persons as sources of approval, support, help, 
and reference [p. 82].” This characterization of 
dependency prompted the additional hypothesis 
that birth-order position is strongly related to 
susceptibility to social influence. Schachter cited 
support for this proposition from the work of 
Ehrlich (1958), who found marginally significant 
birth-order differences for college males in their 
tendency to shift opinions in the direction of a 
presumed group norm. Subsequent studies have 
generally confirmed that persons who were first- 
born or only children appear more susceptible 
to social pressures than do their later-born 
counterparts (Becker & Carroll, 1962; Becker, 
Lerner, & Carroll, 1964; Sampson, 1962). 

The dominant hypothesis accounting for the 
observed birth-order differences in influencibility 
points to distinctive child-rearing experiences 
which presumably give rise to higher affiliative 
needs for first-born and only children. The motivational 
basis for behavioral differences in 
dependency was investigated by both Staples 
and Walters (1961) and Dember (1964) who 
reported birth-order differences in the affiliative 
imagery of stories produced by subjects describ- 
ing TAT pictures under neutral, nonstressful 
conditions. First-born or only subjects produced 
significantly greater affiliative content in their 
stories. Their results supported the proposition 
that birth-order differences in dependency rest 
on a differential motivational base which presumably 
gives rise to behavioral differences whenever 
the situation is relevant to the affiliative 
needs of the subjects. 

Earlier studies by Schachter (1959), Wrightsman 
(1960), and Staples and Walters (1961), 
pointing to birth-order differences, found it not 
only necessary to construct a situation in which 
dependency needs could be realized, but also in- 
troduced physical threat as a situational stress. 
Threat of physical pain in these studies was 
assumed to result in differential anxiety arousal 
associated with birth-order differences. The find- 
ings generally support the Walters-Karal (1960) 
hypothesis, which postulated anxiety as the important 
motivating factor determining affiliative 
behavior. However, much recent work clearly 
demonstrates birth-order differences under a 
variety of conditions. These include the neutral 
conditions used by Staples and Walters (1961) 
and Dember (1964) cited above. In the study 
here reported, heightened affiliative arousal is 
induced by a social rather than a physical threat. 
Appropriate conditions were suggested by the 
work of Atkinson, Heyns, and Veroff (1954), 
who found that mutual sociometric ratings by 
group members effectively increased affiliative 
content in subjects’ TAT stories. Completing 
the sociometric ratings was interpreted as intro- 
ducing the threat of potential withdrawal and 
rejection, and hence, producing higher affiliative 
needs. Thus, the major hypothesis of the present 
investigation predicts significant birth-order dif- 
ferences in influencibility under conditions of 
potential sociometric rejection, with first-born 
or only subjects being more susceptible to 
influence. 

The differential stability of passive and de- 
pendent behaviors found by Kagan and Moss 
(1960) for males and females from childhood 
to adulthood suggests the importance of sex as 
a variable in the study of influencibility. Their 
findings also draw attention to the importance 
of extending earlier work to younger age groups. 
To the extent that early child-rearing experiences 
account for the observed birth-order and sex 
differences in influencibility, we can anticipate 
the corroboration of these differences using a 
preadolescent age group. 

A second factor demands the inclusion of 
both sexes in the study of birth order and 
influencibility. Although the findings of Ehrlich 
(1958), Becker and Carroll (1962), Sampson 
(1962), and Becker et al. (1964) have con- 
sistently shown first-born or only males to be 
more influencible, the pattern for females is 
less clear. Whereas Staples and Walters (1961) 
reported that first-born or only females were 
more influencible, Sampson (1962) found that 
first-born or only females were less susceptible 
to influence. The fact that none of these in- 
vestigations has included both sexes within a 
single design emphasizes the importance of in- 
cluding sex as a major variable. 


METHOD 
Subjects and Design 


Ninety-six sixth-grade students from a suburban 
Buffalo grammar school served as subjects. This 
number was broken down by sex and birth order into 
four groups: first-born or only boys, first-born or 
only girls, later-born boys, and later-born girls. Sub- 
jects from each of these groups were run in one of 
three conditions. Six subjects were randomly deleted 
from the analysis in order to equalize the number 
in each cell of the design. 


Procedure 


Subjects were tested in groups of approximately 
35 members. They were shown 10 selected TAT 
cards projected on a screen and asked to “pick the 
story which best fits what is happening in the 
picture.” Four different stories were presented for 
each of the 10 cards. Each subject had a copy 
of the stories and simply circled the number of the 
story he chose. Following this judgmental task, 
all subjects completed a questionnaire comprised 
of the 28 need Affiliation items of the Edwards 
Personal Preference Schedule (EPPS). At the conclusion 
of the questionnaire each subject answered 
questions concerning family sibling structure and 
birth order. All subjects were run during a single 
school day to minimize communication among 
groups. Insofar as possible, the four alternative 
TAT stories were equated in terms of length (80-100 
words each), vocabulary level, subject matter, and 
need configuration. 


TABLE 1 

MEAN NUMBER OF KEYED STORY CHOICES UNDER THE 
NO-CONFORMITY NEUTRAL CONTROL CONDITION 

              Male      Female 
First born    2.62      2.85 
Later born    3.12      3.50 


Experimental Conditions 


As noted, three conditions were implemented. 
A neutral conformity condition was created by 
designating popular story choices in instructions to 
the subjects. This treatment took the form of telling 
subjects while a particular picture was being shown 
that “last year’s class from this school picked 
as the most fitting story.” Frequent choice of those 
stories presumably picked by last year’s class was 
taken as indicative of a tendency to conform. The 
normative stories were in fact selected at random. 

A neutral no-conformity control condition was 
also evaluated. For this group no information was 
given as to popular story choices. Responses of 
this group were used to check the extent to which 
randomly keyed “popular stories” were differentially 
popular for the different sexes and birth orders. 
Table 1 presents the average selection of the keyed 
stories by members of this group. As anticipated, the 
analysis of variance of these scores revealed no 
significant differences. 

The major experimental treatment, the “potential 
sociometric rejection” (PSR) condition, required all 
subjects to complete a sociometric questionnaire 
concerning fellow classmates just prior to the picture 
judgment task. Questionnaire items asked for friend- 
ship nominations. This procedure was designed to 
increase the salience of affiliative needs by implying 
possible rejection by others. The PSR condition was 
identical to the neutral condition with the exception 
of this sociometric questionnaire. 


RESULTS AND DISCUSSION 


TABLE 2 

MEAN NUMBER OF CONFORMING STORY CHOICES 

              Neutral conformity      Potential sociometric 
              control condition       rejection condition 
              Male      Female        Male      Female 
First born    4.75      5.00          5.38      6.75 
Later born    3.75      4.13          3.50      6.00 


Family birth order, conceived as a psychological 
variable, has consistently related to dependency 
behavior. The present study extended 
this relationship to sixth-grade children and 
investigated the interaction of sex, birth order, 
and situational arousal in determining conforming 
behavior. Consistent with previous findings, 
we anticipated both sex and birth-order differences 
in conforming behavior, with girls more 
conforming than boys and first-born or only 
children more conforming than later-born children. 
In addition, we hypothesized that conditions 
of potential sociometric rejection would heighten 
the birth-order differences in levels of conformity. 
Tables 2 and 3 present the data relevant to 
these hypotheses. Overall differences in conformity 
between conditions were highly significant, 
thus confirming the efficacy of the sociometric 
procedure in producing greater conforming 
behavior. Further, overall differences in 
average conformity between sexes and birth 
orders were significant and in the predicted 
direction. We note, however, that consistent with 
the hypothesis, the observed significant birth-order 
differences are primarily the result of 
differences observed under the PSR condition. 
This comparison (M = 6.06 versus 4.75) was 
significant by the Duncan multiple range test 2; 
the comparable comparison under neutral conditions 
was nonsignificant. 
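
The link between the cell means in Table 2 and the mean squares in Table 3 can be checked with a short calculation. The sketch below is illustrative only and rests on two assumptions that the report does not state outright: 8 subjects per cell (inferred from the within-cells df of 56 across 8 cells), and a value of 5.38 for first-born males under PSR (the value consistent with the reported first-born PSR mean of 6.06). Each 1-df effect is computed as a +1/-1 contrast on the cell means.

```python
# Rebuild the 1-df mean squares of Table 3 from the Table 2 cell means.
# Assumptions (not stated in the source): 8 subjects per cell, inferred
# from the within-cells df of 56 over 8 cells, and 5.38 as the mean for
# first-born males under PSR.

N_PER_CELL = 8
MS_WITHIN = 1.56  # error term reported in Table 3

# Keys are (condition, birth order, sex), each coded -1 or +1:
#   condition:   -1 neutral,    +1 PSR
#   birth order: -1 later born, +1 first born or only
#   sex:         -1 male,       +1 female
CELL_MEANS = {
    (-1, +1, -1): 4.75, (-1, +1, +1): 5.00,
    (-1, -1, -1): 3.75, (-1, -1, +1): 4.13,
    (+1, +1, -1): 5.38, (+1, +1, +1): 6.75,
    (+1, -1, -1): 3.50, (+1, -1, +1): 6.00,
}

# Each effect is named by the factor positions it involves.
EFFECTS = {
    "A": (0,), "B": (1,), "C": (2,),
    "AxB": (0, 1), "AxC": (0, 2), "BxC": (1, 2), "AxBxC": (0, 1, 2),
}


def mean_square(factors):
    """Mean square of a 1-df effect from its +/-1 contrast on cell means."""
    contrast = 0.0
    for codes, mean in CELL_MEANS.items():
        sign = 1
        for index in factors:
            sign *= codes[index]
        contrast += sign * mean
    # SS = n * L^2 / sum(c^2); all coefficients are +/-1, so sum(c^2) = 8.
    return N_PER_CELL * contrast ** 2 / len(CELL_MEANS)


def f_ratio(factors):
    """F ratio of the effect against the reported within-cells mean square."""
    return mean_square(factors) / MS_WITHIN


if __name__ == "__main__":
    for name, factors in EFFECTS.items():
        print(f"{name:6s} MS = {mean_square(factors):6.2f}  "
              f"F = {f_ratio(factors):5.2f}")
```

Because the tabled cell means are rounded to two decimals, the recomputed values agree with Table 3 only to within rounding error.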

The differential responsiveness of the two 
birth orders to the PSR treatment is equally 
striking when the conformity levels under the 
neutral condition are compared with the PSR 
levels. The difference in average conformity for 
the first-born or only children (4.83 versus 6.06) 
was significant at the .05 level. For the later- 
born children, however, this difference was only 
4.75 minus 3.92, which is nonsignificant. 

2 The Duncan new multiple range test (Edwards, 
1960) was used to make individual comparisons of 
cell means. 


TABLE 3 

SUMMARY ANALYSIS OF VARIANCE OF 
CONFORMING STORY CHOICES 

Source             df      MS        F 
Condition (A)       1     16.00    10.26 
Birth order (B)     1     20.25    12.98 
Sex (C)             1     20.25    12.98 
A X B               1       .56    <1.00 
A X C               1     10.56     6.77* 
B X C               1      1.56     1.00 
A X B X C           1      1.01    <1.00 
Within             56      1.56 

* p < .05. 


TABLE 4 

MEAN NEED AFFILIATION SCORED FROM THE EPPS 

              Neutral conformity      Potential sociometric 
              control condition       rejection condition 
              Male      Female        Male      Female 
First born    14.50     13.62         12.50     17.87 
Later born    12.87     13.37         15.62     16.25 


Focusing exclusively on the birth-order differences, 
however, obscures important sex differences 
in conformity. Inspection of Table 2 
indicates clearly that it was the later-born male 
children who remained essentially unchanged by 
the PSR manipulation. Hence, the birth-order 
differences for males under the PSR condition 
paralleled the earlier findings of Becker and 
Carroll (1962), Sampson (1962), and Becker 
et al. (1964). We note here the probable ap- 
plicability of the Becker et al. cogent interpreta- 
tion of birth-order differences in terms of differ- 
ential responsiveness to normative, as opposed 
to informational, social pressures. The present 
procedure for arousing affiliative needs (the 
PSR treatment) may indeed have increased in- 
dividual feelings of marginality in the peer group 
without diminishing the attractiveness of group 
membership, and hence, induced greater norma- 
tive pressure to conform (Jackson & Saltzstein, 
1958). 

Given this interpretation of the meaning of 
the PSR treatment, we are forced to qualify the 
Becker et al. (1964) statement that “the first- or 
later-born person is, or is not, ‘dependent’ as a 
function of which type of influence, normative or 
informational, is operating in a social situation 
[p. 323].” This generalization appears to apply 
to men only. Present results indicate no birth- 
order differences for women under strong norma- 
tive pressure to conform. It is important to note 
the uniformity in levels of response for girls. 
Although first-born or only girls were slightly 
more conforming, clearly they were not signifi- 
cantly different from later-born girls, nor as 
noted was there a differential response to the 
PSR condition. Thus, the birth-order differences 
in dependency behavior observed by Schachter 
(1959) and Staples and Walters (1961) were 
not replicated here. There were clear situational 
differences between the conditions here employed 
and those used earlier. Present results lead us 
to propose that for women threat of physical 
pain is a necessary cue for the induction of 
differential affiliative arousal related to birth 
order. This does not appear to be the case for 
men. This position is given support by Gerard 
and Rabbie’s (1961) findings that under physical 
threat conditions affiliation and emotional arousal, 
as measured by GSR recording, were related 
significantly for women but not for men. 

A supplementary source of data was also 
available in the form of affiliation scores derived 
from subject postsession responses to the EPPS 
(see Table 4). Although the usefulness of this 
scale is seriously questioned (Barron, 1959), it 
is pertinent to note that the analysis of variance 
of these scores showed that the overall level of 
affiliation was significantly higher following the 
PSR treatment (P[F > 6.99] <.05, df = 1/56). 
This finding supported our conception of the 
effects of the social threat manipulation. Un- 
fortunately, these affiliation scores are attenuated 
by the fact that they were obtained at the end 
of the experimental session. The interaction be- 
tween sex, birth order, and condition observed 
in Table 4 reflects dispositional differences to 
the effects of both the sociometric rating proce- 
dure and the individual patterns of response on 
the picture judgment task, and hence, does not 
lend itself to precise interpretation. 

The psychological importance of ordinal posi- 
tion in the family was again shown by the 
present results. Sex differences, however, appear 
to interact with birth order in determining in- 
dividual susceptibility to social influence. Present 
results thus demand the extension of research 
to map these interactive differences. It would 
appear that research has moved from a relatively 
simple conception of the birth-order-dependency 
relationship to where it must now take account 
of the interaction of at least sex, birth order, and 
parameters of the situation. 
EVALUATION OF MESSAGE AND COMMUNICATOR AS A 
FUNCTION OF INVOLVEMENT 1 
ALICE H. EAGLY 2 
University of Michigan 
AND MELVIN MANIS 3 
University of Michigan and Veterans Administration Hospital, Ann Arbor 


This study investigated the effects of ego involvement (a) on the individual's 
evaluation of a persuasive message, and (b) on his evaluation of the message 
source (or communicator). Variations in involvement were produced by 
having Ss of both sexes respond to 2 messages: one message was constructed to 
be relatively involving for male Ss but not for females, and the other was 
constructed to be relatively involving for female Ss but not for males. The 
results indicated that involved Ss were more negative in their evaluations of 
messages and communicators than were noninvolved Ss. 


Psychological involvement depends upon the 
importance of a given issue or situation for the 
individual’s self-concept. Involving issues are 
typically associated with attitudes that play a 
prominent role in determining the individual’s 
conception of himself, while noninvolving issues 
are only peripherally related to the self. The 
arousal of self-defining attitudes in relation to 
a persuasive message is expected to affect the 
recipient’s evaluation of the message and of its 
source. 

Sherif and Hovland (1961) have hypothesized 
that involvement in a social issue typically pro- 
duces a restricted latitude of acceptance (the 
range of positions considered acceptable) and an 
extended latitude of rejection (those positions 


1 This study was conducted by Alice H. Eagly as 
a graduate student under the supervision of Melvin 
Manis. The authors wish to thank Herbert C. 
Kelman for his comments on an earlier version of 
the manuscript. 

2 Now at Michigan State University. 

3 All statements are those of the authors and do 
not necessarily represent the opinions or policy of 
the Veterans Administration. 


considered objectionable). They also cite data 
suggesting that the width of these latitudes may 
have an important effect upon the individual's 
response to persuasive communications. Location 
of a communication within the latitude of ac- 
ceptance is said to affect response in such a way 
that a communication is evaluated as fair and 
unbiased, while location within the latitude of 
rejection produces an evaluation of the com- 
munication as unfair and biased. On any given 
issue, then, uninvolved people should exhibit a 
positive evaluation more frequently than those 
who are involved, since lack of involvement is 
presumably accompanied by an extended latitude 
of acceptance and incoming messages would have 
an enhanced likelihood of falling within the 
acceptable range of the attitude continuum. 

A similar prediction may be derived from an 
alternative point of view that focuses upon the 
recipient’s attempts to minimize opinion change, 
particularly in areas central to his self-concept. 
In the attempt to avoid the disruption of a stable 
self-identity, involved subjects may reject a dis- 
crepant communication as biased and unfair, and 
devaluate the communicator as uninformed and 
generally unpleasant, while these reactions should 
appear less frequently among the uninvolved. 


TABLE 1 

EVALUATION OF MESSAGES 

            Boys' message         Girls' message         M 

            High involvement      Low involvement 
Boys             2.04                  2.10             2.07 
            Low involvement       High involvement 
Girls            2.34                  2.10             2.22 

M                2.19                  2.10 

Note. n = 62 (in each of the two groups); total N = 124. 


TABLE 3 

EVALUATION OF COMMUNICATORS 

            Boys' message         Girls' message         M 

            High involvement      Low involvement 
Boys             2.15                  2.33             2.24 
            Low involvement       High involvement 
Girls            2.35                  2.27             2.31 

M                2.25                  2.30 

Note. n = 62 (in each of the two groups); total N = 124. 

The effects of involvement upon the recipient’s 
evaluation of the communicator and his message 
have not been studied systematically. However, 
Schachter’s (1951) study, which could be inter- 
preted as dealing with a similar relationship, 
demonstrated that deviation from group norms 
on a matter of clear relevance to the group goals 
produced greater rejection than did deviation on 
an irrelevant matter. The present study was 
designed to obtain further information concerning 
the effects of ego involvement on evaluation. 


METHOD 
Subjects 


The subjects were 124 ninth-grade junior high 
school students who participated during regular class- 
room hours. All subjects were present at both of the 
two testing sessions. 


Procedure 


The experiment was carried out in two sessions. 
The first session was devoted to an assessment of 
subjects’ attitudes on the topics of communication. 
Opinions toward each topic were assessed by means 
of three Likert-type items, each of which was accompanied 
by a 6-point scale that ranged from “completely agree” 
to “completely disagree.” Several weeks 
later, persuasive communications attacking the average 
subject's views were presented and evaluation 
was assessed. 

Variations in involvement were produced by 
having each subject respond to both a “boys’ message” 
and a “girls’ message.” The boys’ message was 
constructed to be relatively involving for male subjects 
but not for females, while the girls’ message 
was to be involving for females but not for males. 
Each subject read both of these communications. 
Half of the subjects read the boys’ message first 
and then the girls’ message; messages were presented 
in the reverse order for the remaining subjects. 


Messages 


Subjects read two persuasive messages, each about 
350 words long. The messages were purportedly 
written by “students at another school”; the author 
of the girls’ message was described as a “ninth-grade 
girl” and the author of the boys’ message as 
a “ninth-grade boy.” Both messages argued that 
teen-agers should be more strictly controlled by 
adults. More specifically, the boys’ message stated 
that delinquency among teen-age boys could be 
reduced if parents, teachers, and other adults would 
provide strict rules for boys. In the girls’ message, 
the author contended that mothers should control 
their daughters’ choice of clothes in order to prevent 
unwise selections. Data from the first session indicated 
that most of the subjects were mildly negative 
with respect to the views advocated by the 
communicators. 


TABLE 2 

ANALYSIS OF VARIANCE AND COVARIANCE: 
EVALUATION OF MESSAGES 

Source                        SS      df      MS       F 
Between subjects 
  Sex (A)                    1.46      1     1.46     3.48 
  Subjects within A (B)     51.09    122     0.42 
Within subjects 
  Message (C)                0.49      1     0.49     2.91 
  A X C                      1.31      1     1.31     7.79* 
  C X B                     20.46    121     0.17 
Adjusted 
 Between subjects 
  A                          1.32      1     1.32     3.68 
  B                         43.59    121     0.36 
 Within subjects 
  C                          0.50      1     0.50     2.96 
  A X C                      1.31      1     1.31     7.76* 
  C X B                     20.44    121     0.17 

* p ≤ .01. 


TABLE 4 

ANALYSIS OF VARIANCE AND COVARIANCE: 
EVALUATION OF COMMUNICATORS 

Source                        SS      df      MS       F 
Between subjects 
  Sex (A)                    0.33      1     0.33 
  Subjects within A (B)     48.78    122     0.40 
Within subjects 
  Message (C)                0.14      1     0.14 
  A X C                      1.16      1     1.16     5.48* 
  C X B                     25.94    122     0.21 
Adjusted 
 Between subjects 
  A                          0.31      1     0.31 
  B                         46.22    121     0.38 
 Within subjects 
  C                          0.26      1     0.26     1.94 
  A X C                      0.58      1     0.58     2.92 
  C X B                     23.87    121     0.20 

* p ≤ .05. 


Measure of Evaluation 


Immediately after reading each message, subjects 
were presented with four multiple-choice questions. 
Two of these questions required an evaluative assess- 
ment of the message: “How fair do you think the 
statement is?” and “How well-written do you think 
the statement is?” Three graded responses were provided 
for each question (e.g., “very fair, fair, and 
unfair”). In each case, the most favorable response 
was scored 3, the intermediate response was scored 2, 
and the most unfavorable response was scored 1. 
Scores from the two questions were combined to 
obtain an overall index of subject’s evaluation of 
the message. 

A similar technique was employed to assess evalua- 
tion of the communicator. The pertinent questions 
here were: “Do you think that the boy [girl] who 
wrote the statement is well-informed about teen-age 
problems?” and “Do you think that the boy [girl] 
who wrote the statement has a good personality?” 
Three graded responses were again provided (e.g., 
“very well-informed, fairly well-informed, and not 
well-informed”); favorable, intermediate, and un- 
favorable responses were scored 3, 2, and 1, 
respectively. 
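
The scoring rule can be written out as a small function. This is a minimal sketch, not the authors' procedure: the report says only that the two item scores were "combined," and the averaging used here is an assumption, suggested by the fact that the tabled means fall between 1 and 3. The constant and function names are hypothetical.

```python
# Hypothetical names; the 3/2/1 scoring is from the text, the averaging
# of the two items is an assumption.
FAVORABLE, INTERMEDIATE, UNFAVORABLE = 3, 2, 1
VALID_SCORES = (FAVORABLE, INTERMEDIATE, UNFAVORABLE)


def evaluation_index(first_item, second_item):
    """Combine two item scores (each 1, 2, or 3) into one index by averaging."""
    for score in (first_item, second_item):
        if score not in VALID_SCORES:
            raise ValueError(f"item score must be 1, 2, or 3, got {score!r}")
    return (first_item + second_item) / 2


if __name__ == "__main__":
    # A subject who rated a message "very fair" but only moderately well-written.
    print(evaluation_index(FAVORABLE, INTERMEDIATE))
```

The same function would serve for the communicator measure, since both indices are built from two items scored on the same 3-point scale.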


RESULTS AND DISCUSSION 


Table 1 presents the mean evaluations of the 
messages (higher scores indicate more favorable 
evaluations). These data support the prediction 
that messages challenging the recipient’s views 
are evaluated less favorably by those who are 
involved in the experimental issue than by those 
who are uninvolved (i.e., male subjects respond 
less favorably to the boys’ message than to the 
girls’ message, while the females reverse this 
pattern). The analysis of variance presented in 
Table 2 confirmed the significance of this effect; 
the interaction between subject’s sex and mes- 
sage content was significant at the .01 level. An 
analysis of covariance (Winer, 1962) was also 
performed to guard against the possibility that 
these results were primarily a function of initial 
attitude differences, rather than involvement 
(e.g., the boys might have responded relatively 
favorably to the girls’ message because it was 
more compatible with their own initial beliefs). 
The results of the analysis of covariance (presented 
in Table 2) indicated once again a significant 
interaction between sex and message content. 
The involvement effect, then, is not explainable 
in terms of attitude differences. 


4 The experiment also involved attitude change 
and message interpretation as dependent variables, in 
addition to the evaluation measures discussed above. 
Unfortunately, the complexity of the obtained results 
in combination with the sequence in which these 
variables were assessed made it difficult to interpret 
the data unambiguously. 

Table 3 presents the mean evaluations of the 
communicators (higher scores indicate more 
favorable evaluations). The data show a tendency 
for the involved subjects to respond less favor- 
ably to the communicator than do the unin- 
volved. The statistical analysis of these data 
appears in Table 4. The interaction between 
subject’s sex and message content was again 
significant when evaluated in an analysis of vari- 
ance. When these data were reevaluated holding 
initial attitude constant in an analysis of covari- 
ance, the Sex X Message interaction was again 
obtained, but this time at a borderline level of 
significance (p < .10).5 

In general, the results support the prediction 
that involved subjects react more negatively than 
the uninvolved when presented with a persuasive 
communication that contradicts their beliefs. In 
the present experiment, this negative reaction 
resulted in an unfavorable response to both the 
communicator and his message. This result is 
especially interesting when viewed in relation to 
the studies that have associated high involve- 
ment in an issue with resistance to attitude 
change (Fine, 1957; Miller, 1965). The possibil- 
ity exists that resistance to change in the high 
involvement condition could be mediated by a 
negative evaluation of the communicator and his 
message. 

5 To assess the possibility that the two evaluation 
variables were essentially measuring the same thing 
(hence the similarity of the obtained results), correlation 
coefficients were computed within each experimental 
condition. The mean correlation across 
conditions (calculated from the r to z′ transformation) 
was .61. This correlation, while substantial, 
does not unequivocally account for the parallel 
findings obtained with the two dependent variables; 
less than 40% of the variance associated with each 
evaluation measure can be accounted for in this 
manner. 
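
The arithmetic behind this footnote can be sketched as follows. The per-condition correlations are not reported, so the example works only from the mean of .61; the equal weighting of conditions and the function names are assumptions made for illustration. Correlations are averaged on Fisher's z scale and transformed back, and the shared variance is the square of the resulting r.

```python
import math


def mean_correlation(correlations):
    """Average correlations via Fisher's r-to-z transformation.

    Each r is converted to z = atanh(r), the z values are averaged with
    equal weights (an assumption), and the mean is converted back with tanh.
    """
    z_values = [math.atanh(r) for r in correlations]
    return math.tanh(sum(z_values) / len(z_values))


def shared_variance(r):
    """Proportion of variance two measures share, given their correlation."""
    return r ** 2


if __name__ == "__main__":
    r_mean = 0.61  # mean correlation reported in the footnote
    print(f"shared variance = {shared_variance(r_mean):.4f}")
```

A correlation of .61 corresponds to a shared variance of .61 squared, which is the basis for the statement that less than 40% of the variance is accounted for.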
EFFECTS OF PRISONER'S DILEMMA FORMAT ON 
COOPERATIVE BEHAVIOR 


GARY W. EVANS AND CHARLES M. CRUMBAUGH 


Bureau of Child Research, University of Kansas, and Parsons State Hospital 
and Training Center 


General psychology students played 1 of 2 versions of the Prisoner's Dilemma 
game for 50 trials. One version of the game was presented in matrix form and 
the other in nonmatrix form. Ss under the nonmatrix format condition cooperated 
more frequently than Ss under the matrix format condition. This 
finding supported the contention that the high frequency of uncooperative 
behavior typically exhibited by college students playing the Prisoner's Dilemma 
game is due, at least in part, to nonstrategic elements of matrix presentation 
of the game. 


Two-person, nonzero-sum games provide ex- 
perimental tasks which may permit investigation 
of unexplored areas of social behavior. The most 
common type of the two-person, nonzero-sum 
games is usually referred to as Prisoner's 
Dilemma. The essential feature of the game is 
that each person must choose between a coopera- 
tive response which increases the total gain of 
both participants or a noncooperative response 
which maximizes an individual's gain on a given 
trial but reduces the joint return of both players. 
A cooperative strategy, in addition to increasing 
the joint gain, leads to greater personal gains 
over a series of trials provided mutual coopera- 
tion between the two players exists or can be es- 
tablished. Thus, the elements of the game are 
common to many mixed-motive social situations; 
a person’s short-term interests are in conflict 
with his long-term interests and the consequences 
of one’s behavior depend upon how another 
person responds. 
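
The payoff structure described above can be made concrete with a short sketch. The numerical payoffs used here (5, 3, 1, 0) are hypothetical and are not taken from the authors' Figure 1, which is not reproduced in this text; the class and method names are likewise illustrative. The check encodes the two properties the paragraph names: the noncooperative response pays more to the individual on any single trial, while mutual cooperation yields the largest joint return.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PrisonersDilemma:
    """Two-person game; all payoff values are hypothetical illustrations."""
    temptation: float  # defect while the other player cooperates
    reward: float      # both players cooperate
    punishment: float  # both players defect
    sucker: float      # cooperate while the other player defects

    def is_dilemma(self) -> bool:
        """True if defection dominates but mutual cooperation pays most jointly."""
        individual = (self.temptation > self.reward
                      > self.punishment > self.sucker)
        joint = (2 * self.reward > self.temptation + self.sucker
                 > 2 * self.punishment)
        return individual and joint

    def payoffs(self, choice_one: str, choice_two: str) -> Tuple[float, float]:
        """Payoffs to (Person I, Person II) for choices 'C' or 'D'."""
        table = {
            ("C", "C"): (self.reward, self.reward),
            ("C", "D"): (self.sucker, self.temptation),
            ("D", "C"): (self.temptation, self.sucker),
            ("D", "D"): (self.punishment, self.punishment),
        }
        return table[(choice_one, choice_two)]


if __name__ == "__main__":
    game = PrisonersDilemma(temptation=5, reward=3, punishment=1, sucker=0)
    print(game.is_dilemma(), game.payoffs("D", "C"))
```

Because only the ordering of the payoffs matters strategically, any presentation that preserves these inequalities is the same game, which is the premise of the matrix versus nonmatrix comparison.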


FIG. 1. Presentations of the Prisoner’s Dilemma game: 
A, matrix presentation; B, nonmatrix presentation. 


One consistent finding, when subjects play the 
Prisoner’s Dilemma game for a limited number 
of trials and communication is not allowed, is 
a prevalence of noncooperative choices (e.g., 
Bixenstine, Potash, & Wilson, 1963; Bixenstine 
& Wilson, 1963; Deutsch, 1960; McClintock, 
Harrison, Strand, & Gallo, 1963; Minas, Scodel, 
Marlowe, & Rawson, 1960; Scodel, 1962; Scodel, 
Minas, Ratoosh, & Lipetz, 1959). Whether this 
predominantly noncooperative behavior is char- 
acteristic of college students in mixed-motive 
situations or whether it is due to nonstrategic 
elements introduced by a matrix format has not 
been determined. Since the format of the Prisoner's 
Dilemma game is strategically irrelevant, 
alternate experimental tasks could be devised 
which maintain the mathematical properties of 
the game but change the characteristics of matrix 
presentation. If a matrix format does introduce 
an element into the situation which decreases the 
frequency of cooperative behavior, then more 
such responses should be produced by the alter- 
nate formats. If, on the other hand, cooperative 
behavior is a function only of the strategic con- 
siderations, the format should make no difference. 

This experiment investigated the effect of 
Prisoner’s Dilemma format upon the cooperative 
behavior of college students playing the game. 


METHOD 


Subjects were 88 general psychology students at 
Kansas State College of Pittsburg who volunteered 
to participate in a decision-making experiment. 

The subjects were brought in dyads into the 
experimental room and seated on opposite sides of 
a partition which obstructed their view of each 
other. The experimenter sat in full view of both 
subjects and at a right angle to them. 

Dyads were randomly assigned either to the 
matrix format condition or to the nonmatrix format 
condition. Subjects under the matrix format condition 
were provided the matrix shown in Figure 1A 
and given the following instructions: 


The purpose of this experiment is to study 
decision making where each person’s decision has 
an effect on the other as well as himself. There 
are two of you who are going to make a series 
of decisions. The choices you make will determine 
how many points you get. At the end of the 
experiment you may trade your points for money. 
You may keep the money you earn. 

Look at the matrix before you. You are Per- 
son I [the experimenter pointed]; you are Person 
II. The first number in the parentheses represents 
Person I's earnings and the second number represents 
Person II's earnings. The points you earn are 
determined by the combined choices of both of 
you. If you choose A, you will earn 3 points or 
no points; if you choose B, you will earn 4 
points or 1 point, depending on what the other 
person does. If you both choose A, you will each 
earn 3 points. If you both choose B, you will 
each earn 1 point. If one chooses A while the other 
chooses B, the one who chooses B will earn 4 
points and the one who chooses A will earn 
nothing. 

When you have made your decision, write A 
or B on the sheet before you. After both of you 
have made your decisions, I will show you how 
many points you earned by writing the earnings 
for that decision beside your choice. You will do 
this 50 times. Are there any questions? 


Subjects under the nonmatrix format condition 
were provided a wooden chip and one of the two 
form boards shown in Figure 1B along with the 
following instructions: 


The purpose of this experiment is to study 
decision making where each person's decision has 
an effect on the other as well as himself. 

There are two of you who are going to make 
a series of decisions. The choices you make will 
determine how many points you get. At the end 
of the experiment you can trade your points for 
money. The more points you earn, the more 
money you will get. You may keep the money 
you earn. 

You must decide whether you want me to give 
you 1 point, or give him [her] 3 points. If you 
want me to give him [her] 3 points, place this 
chip here [the experimenter pointed]. If you want 
me to give you 1 point, place the chip here. 
[This paragraph was read to each subject indi- 
vidually.] 

After both of you have decided what you want 
me to do, I will show you how many points you 
got on that trial by writing it on the piece of 
paper in front of you. You will do this 50 times. 
Do you have any questions? 


Fig. 2. Mean number of cooperative choices by 10 
trial blocks for matrix and nonmatrix conditions. 
(Axes: mean number of cooperative choices; blocks 
of ten trials each.) 


Notice that the strategic considerations of the 
form board task are identical to those of the matrix 
task. The case in which both participants make the 
mutually cooperative choice of "Give him [her] 3" 
is equivalent to the mutually cooperative AA combination 
of the matrix; the case of mutual "Give 
me 1" is equivalent to the BB combination; and 
the case in which one subject chooses "Give him 
[her] 3" and the other chooses "Give me 1" is 
equivalent to the AB or BA choice combinations of 
the matrix. 
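
The claimed equivalence of the two formats can be checked by enumerating the four choice combinations. The following Python sketch is illustrative only: the payoff values come from the instructions quoted above, while the names `MATRIX`, `form_board_payoffs`, `LABEL`, and the strings "give3" and "give1" are assumptions introduced for the example.

```python
# Payoffs stated in the matrix instructions: (Person I points, Person II points).
MATRIX = {
    ("A", "A"): (3, 3),
    ("A", "B"): (0, 4),
    ("B", "A"): (4, 0),
    ("B", "B"): (1, 1),
}

# Hypothetical labels: "give3" = "Give him [her] 3", "give1" = "Give me 1".
LABEL = {"A": "give3", "B": "give1"}


def form_board_payoffs(choice_1, choice_2):
    """Return (Person I points, Person II points) for the form board task."""

    def earned(own, other):
        points = 1 if own == "give1" else 0      # the point a player keeps
        points += 3 if other == "give3" else 0   # the points the other player gives
        return points

    return earned(choice_1, choice_2), earned(choice_2, choice_1)


def formats_are_equivalent():
    """True when every matrix cell matches the corresponding form board outcome."""
    return all(
        form_board_payoffs(LABEL[a], LABEL[b]) == payoff
        for (a, b), payoff in MATRIX.items()
    )


if __name__ == "__main__":
    print(formats_are_equivalent())
```

Under these assumptions the two tasks yield the same payoff for every pair of choices, which is the sense in which the format is strategically irrelevant.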

TABLE 1 
ANALYSIS OF VARIANCE OF COOPERATIVE CHOICES 

Source                  df         SS        MS        F 
Total                  439   2,987.49 
Between subjects        87   2,281.66     26.22 
  Format                 1     234.18    234.18     9.84** 
  Error (A)             86   2,047.48     23.81 
Within subjects        352     705.83 
  Trial blocks           4      20.25      5.06     2.58* 
    Linear               1       3.69      3.69     1.88 
    Quadratic            1       1.64      1.64 
    Cubic                1      14.77     14.77     7.54** 
    Quartic              1        .15       .15 
  Format × Blocks        4       9.86      2.46     1.96 
  Error (B)            344     675.72      1.96 

* p < .05. 
** p < .01. 


After the instructions were given, each subject was 
asked two questions concerning the payoffs for the 
various choice combinations. If he did not answer 
the questions correctly, the pertinent sections of the 
instructions were repeated. This process was continued 
until each subject could answer two consecutive 
questions. Fifty consecutive trials were run and 
points earned were recorded after each trial for 
each subject. Although the instructions and experimental 
arrangements created the impression that 
payoffs were based on the combination of choices 
made by the two subjects, subjects were actually 
playing against an experimenter-controlled, predetermined 
"stooge" strategy. The controlled strategy 
was a conditionally cooperative one in which the 
"stooge" chose cooperatively on the first trial and 
then reflected the subject's response on the previous 
trial for Trials 2 through 50 (i.e., the experimenter 
chose on Trial N what the subject had chosen on 
Trial N - 1). After the 50 trials, the subjects exchanged 
their points for money at a ratio of 2 points to 1 cent. 
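
The experimenter-controlled strategy described in this section (cooperate on the first trial, then repeat the subject's previous choice) can be sketched in a few lines of Python. This is a minimal illustration, not the experimenter's actual procedure: the codes "C" (cooperative) and "N" (noncooperative) and all function names are assumptions, and the payoffs are those given in the instructions.

```python
# Points earned by the subject for (subject choice, stooge choice).
SUBJECT_PAYOFF = {("C", "C"): 3, ("C", "N"): 0, ("N", "C"): 4, ("N", "N"): 1}


def stooge_choices(subject_choices):
    """Stooge cooperates on Trial 1, then copies the subject's choice on Trial N - 1."""
    stooge = []
    for trial in range(len(subject_choices)):
        if trial == 0:
            stooge.append("C")
        else:
            stooge.append(subject_choices[trial - 1])
    return stooge


def subject_points(subject_choices):
    """Total points a subject earns against the conditionally cooperative stooge."""
    pairs = zip(subject_choices, stooge_choices(subject_choices))
    return sum(SUBJECT_PAYOFF[pair] for pair in pairs)


def points_to_cents(points):
    """Convert points to cents at the stated ratio of 2 points to 1 cent."""
    return points / 2


if __name__ == "__main__":
    always_cooperate = ["C"] * 50
    never_cooperate = ["N"] * 50
    print(subject_points(always_cooperate), subject_points(never_cooperate))
```

Against such a strategy, consistent cooperation over 50 trials earns far more than consistent noncooperation, which illustrates the conflict between short-term and long-term interests described in the introduction.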


RESULTS 


The choice behavior of subjects by format 
condition and blocks of 10 trials is presented 
in Figure 2 in terms of the mean number of 
cooperative (“A” or "give 3”) responses. 
Table 1 includes the results of a mixed design 
analysis of variance which considers the effect 
of game format, change over trial blocks, and 
the interaction between format and trial blocks. 
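
The F ratios for the main effects in Table 1 can be recomputed from the sums of squares and degrees of freedom reported there. The Python sketch below assumes the usual mixed-design convention in which the between-subjects effect (format) is tested against Error (A) and the within-subjects effects are tested against Error (B); the dictionary and function names are illustrative.

```python
# (degrees of freedom, sum of squares) as reported in Table 1.
TABLE_1 = {
    "Format": (1, 234.18),
    "Error (A)": (86, 2047.48),
    "Trial blocks": (4, 20.25),
    "Linear": (1, 3.69),
    "Cubic": (1, 14.77),
    "Error (B)": (344, 675.72),
}


def mean_square(source):
    """Mean square = sum of squares divided by degrees of freedom."""
    df, ss = TABLE_1[source]
    return ss / df


def f_ratio(effect, error):
    """F = mean square of the effect divided by mean square of its error term."""
    return mean_square(effect) / mean_square(error)


if __name__ == "__main__":
    print(round(f_ratio("Format", "Error (A)"), 2))
    print(round(f_ratio("Trial blocks", "Error (B)"), 2))
    print(round(f_ratio("Cubic", "Error (B)"), 2))
```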

Inspection of Figure 2 and Table 1 reveals 
that subjects under the nonmatrix format condi- 
tion demonstrated a significantly greater level of 
cooperation than subjects under the matrix 
format condition. Table 1 also shows that the 
trial blocks contain a statistically significant 
trend. Trend analyses performed subsequently 
reveal that the trend over trial blocks can best 
be described as a cubic function. The inter- 
action between format and trial blocks was not 
significant; thus there is no evidence that the 
slopes representing the format conditions were 
approaching the same asymptotic value. 

These findings support the contention that the 
high frequency of uncooperative behavior typi- 
cally exhibited by college students playing the 
Prisoner’s Dilemma game is due, at least in part, 
to nonstrategic elements of matrix presentation 
of the game. Apparently absolute level of cooperation 
is not independent of the context in 
which the conflict situation is presented. 

One of the most striking features of this 
experiment was the relative ease and speed with 
which the instructions were mastered by subjects 
under the nonmatrix condition. Since the data 
offer no suggestions concerning which type of 
format should be used in further experimentation, 
the decision can be made on the basis of 
convenience and appropriateness for the population 
to be studied. For instance, a nonmatrix 
format may be more appropriate than matrix 
format for such populations as children or the 
mentally retarded. 


SOME DETERMINANTS OF EMITTED REINFORCING BEHAVIOR: 
LISTENER REINFORCEMENT AND BIRTH ORDER ¹ 


ROBERT L. WEISS ² 


Palo Alto, California, Veterans Administration Hospital and Stanford University 


Individuals reinforce one another in almost all interpersonal relationships, 
yet little is known about the determinants of this behavior in everyday life. 
A technique was devised so that listeners instructed to maintain rapport with a 
speaker could reinforce a prerecorded speaker. This study determined whether 
individuals are consistent in their reinforcing behavior, and whether, as 
predicted, there exists a relationship between birth order and reinforcing 
behavior. This hypothesis is based upon knowledge that 1st-born children 
show greater responsiveness to affiliative cues than later-born children. The 
results indicated that individuals are in fact very consistent in amount of 
reinforcement emitted, and that initial reinforcing behavior satisfactorily pre- 
dicts subsequent behavior. The hypothesized relationship between birth order 
and amount of reinforcing behavior was also confirmed; 1st-born and only 
children reinforced the speakers more than later-born children. These results 
indicate that there are meaningful individual differences in reinforcing behavior 
which can be studied. 


Emitting reinforcing behaviors to others is an 
aspect of almost all interpersonal relationships, 
yet little is known about the determinants of 
reinforcing behavior in everyday life. The results 
of laboratory studies of social reinforcement 
converge on the theme that generalized rein- 
forcers, in diverse forms, are effective for diverse 
populations (e.g., Bandura & Walters, 1963; 
Greenspoon, 1962; Krasner, 1958, 1962). The 
present paper summarizes one aspect of a program 
designed to investigate inter- and intraindividual 
consistencies in a class of behavior describable 
as "reinforcing behaviors." 


Our systematic knowledge about social rein- 
forcement stems mainly from studies focusing 
on the laboratory behavior of persons when 
reinforced. A few studies (Centers, 1963; Ver- 
planck, 1955) have departed from this tradition 
and qualify as investigations of "reinforcement 
in everyday life." Although dealing with reinforcement 
contingencies in more natural settings, 
these studies remained focused on changes 
brought about by “examiner” reinforcement. 

The present study had two specific aims for 
investigating reinforcing behavior. First, to determine 
the extent to which individual consistencies 
are manifest in the reinforcing behavior of 
listeners in the context of a speaker-listener 
relationship. Second, to establish the social or 
"ecological validity" of emitted reinforcing 
behavior by relating it to an organismic variable 
presumed to mediate affiliative behavior tendencies. 


1 From the Behavioral Research Laboratory, Palo 
Alto Veterans Administration Hospital. 

2 The author wishes to express his appreciation 
to O. B. Nereson for making available a listening 
audience of subjects, and to Jean Aron for technical 
assistance throughout the study. 

Since the appearance of Schachter's (1959) 
work relating ordinal birth position to affiliative 
behaviors, relationships have been established 
between personality test measures of affiliative 
motives and birth order (Conners, 1963; Dember, 
1964). Dember found scorable TAT n Affiliation 
themes to be significantly more frequent among 
first-born (and only) children than among later- 
born children. The relationship was most ap- 
parent for female subjects, a finding consistent 
with Schachter’s results also with female sub- 
jects. In the present study it was hypothesized 
that birth order would be related to amount of 
reinforcing behavior emitted to a speaker when 
the listener’s role was to “maintain rapport” 
with the speaker. Specifically, it was predicted 
that first-born and only children as listeners 
would be more responsive to a speaker than 
later-born children because of the former group’s 
sensitivity to affiliative cues. Thus, in the role 
of listeners first-born (and only) children would 
be more likely to reinforce a speaker. Confirmation 
of this predicted relationship between birth 
order and reinforcing behavior would increase 
the credibility of treating reinforcing behavior 
as a form of emitted behavior having interpersonal 
significance. 


METHOD 
Subjects 


An entire psychology class, N = 29, enrolled at 
Foothill College served as subjects. There were 17 
males (M age = 20.3 years) and 12 females (M age = 
22.5 years). There were 18 first-born (and only) 
and 11 later-born children in the group. Testing 
was conducted at the Palo Alto Veterans Administra- 
tion Hospital as part of a class visit to a psychology 
research laboratory. 


Procedure 


The general technique involved a group listening 
session, where subjects were invited to role play the 
listener to two tape-recorded speakers. As listeners, 
subjects were asked to "maintain rapport" with the 
speakers by facilitating or otherwise making it 
"easy for the speaker to talk to you individually, 
as if you two were alone.” Whenever a subject 
wished to make a response which he, the subject, 
defined as “rapport giving" he merely pressed a 
silent button held in his hand for this purpose. 
Thus, each “reinforcement” was translated into 
button presses. The frequency and point at which the 
subject chose to reinforce the speaker was left to 
the subject’s interpretation of the role. The role- 
play set required subjects merely to maintain 
rapport with the speaker. 

Prerecorded instructions to the listening audience 
were played through the same audio equipment 
used to play the speaker tapes. After each listening 
session the subjects filled out a 50-item adjective 
check list in order to describe their impression of 
each of the two speakers. (There were an equal 
number of positive and negative adjectives.) 


Speakers 


The "speakers" were prerecorded monologues 
made by a senior high school girl, in her own role, 
and a male Stanford junior, in the role of a fresh- 
man Stanford student. The female speaker had been 
instructed to talk about herself in a candid manner, 
emphasizing her likes and dislikes, reactions to 
school, etc., but not to dwell upon personal prob- 
lems. The result was a 12-minute fast-moving ac- 
count of the reactions of a bright, verbal, adolescent 
girl recounting her anticipations about entering 
college, reactions to teachers, dating, etc. The male 
speaker produced a 10-minute tape which, by con- 
trast, was slower moving, more prone to word 
clichés about fraternities ("After all, they are your 
brothers . . .”), difficulties finding dates, and en- 
thusiasm (or disappointment) for instructors. The 
affective value of his content had been intentionally 
balanced between positive and negative state- 
ments. 


3 The male speaker tape was generously provided 
by Albert Bandura, who had it produced for his 
own work on interviewing. The tape was well suited 
to the present purposes because the speaker talked 
candidly as if to only one listener. 


Apparatus 


By means of a network of silent buttons each subject 
was connected to a pen on either of two 20-pen 
event recorders, which recorded the frequency and 
location in time of each person's button-pressing 
activity. One (marker) pen was used to synchronize 
the content of the speaker tape with the chart 
speed, thus making it possible to determine what 
speaker events were being responded to by the 
listeners. A stereotape recorder was used with all 
verbal content prerecorded on one channel, and a 
5-second time signal prerecorded on the other 
channel. The content channel was played to the 
audience (through additional speakers to insure 
clarity) and the time signal (not otherwise audible) 
was played through an audio relay which activated 
the marker pen of the event recorders. The event 
recorders were housed in a specially prepared sound- 
deadening box located in a room adjacent to the 
group listening room. 


RESULTS 
Individual Consistencies 


The total number of button presses (“rein- 
forcements”) emitted to male and female speak- 
ers were tabulated separately. Large and sig- 
nificant product-moment correlations were ob- 
tained between number of reinforcements emitted 
to both speakers; these r’s are listed in Table 1. 
Subjects were remarkably consistent in the 
amount of reinforcement given to a speaker even 
though the speakers differed in style and content. 
A wide range of individual differences in 
number of reinforcements was noted; listeners 
ranged from giving 1 to 156 responses in approximately 
20 minutes of listening. 

The stability of listener reinforcing behavior 
was further assessed by correlating the number 
of reinforcements for the first 2.5 minutes to 
the number emitted during the remaining listening 
time. These r's, for sex of listener and speaker, 
are presented in the right-hand columns of 
Table 1. From the magnitude of the r's it is 
apparent that number of reinforcements emitted 
during the first 2.5 minutes of listening ade- 
quately predicts a listener’s output for the next 
8-10 minutes. There is a tendency for predictability 
to vary with sex of speaker and listener, 
in that same-sex combinations are more predictable. 
The differences between these correlations 
are not statistically significant, however. 


Birth Order and Reinforcing Behavior 


The relationship between birth order and 
reinforcing responses emitted to the speakers is 
presented in Table 2. Listeners were classified 
as high or low in relation to the median number 
of reinforcements emitted and according to birth 
order (first and only versus later born). As 
predicted, a significant relationship between birth 
order and emitted reinforcements was obtained; 
first-born (and only) children were more responsive 
to the speakers. When the responses 
of the male and female listeners were analyzed 
separately, a relationship to birth order was 
found for the female listeners but not for the 
male listeners. Analyses were made of the 
relationship between birth order and number of 
reinforcements emitted to the male and female 
speakers separately. Only the chi-square for 
responses to the male speaker approached significance 
(χ² = 2.86, p = .09). 


TABLE 1 
THE r's FOR NUMBER OF REINFORCEMENTS EMITTED 
TO MALE AND FEMALE SPEAKERS 

                                      2.5 minutes versus total (b) 
Listener            Male versus       Male            Female 
                    female (a)        speaker         speaker 
Male (n = 17)          .88              .83           [illegible] 
Female (n = 12)        .90              .72             .91 

Note. r > .70, p = .01. 
(a) Correlation between responses emitted to male and female tape. 
(b) Initial 2.5 minutes to remaining responses. 


Ratings of the Speakers 


The number of positive and negative adjectives 
used to describe the speakers was compared 
to the number of reinforcements emitted to the 
speakers. Male and female listener behavior 
was similar toward the male speaker: frequency of 
reinforcement was not related to number of 
positive adjectives checked (r's = .08 and .07 
for males and females, respectively), and only 
partially related to number of negative adjectives 
checked (r's = .48, p = .05, and .44, ns). 
Reactions to the female speaker were different 
for the sexes: reinforcements and number of 
positive adjectives were correlated for the males 
(r = .61, p = .01), but not for the females 
(r = .15). The r's between reinforcements and 
number of negative adjectives were insignificant 
(.23 and .39, males and females, respectively). 

Individual consistencies were noted for the 
total number of adjectives checked for either 
speaker (r's = .71 and .71, males and females, 
respectively). There was no relationship between 
number of adjectives checked and birth order. 
Thus for this sample, birth order was not merely 
a predictor of "response style" but rather was 
specific to one aspect of affiliative behavior, viz., 
reinforcing behavior. 


TABLE 2 
BIRTH ORDER AND NUMBER OF REINFORCEMENTS 

                                  Listeners 
Number of              All            Female            Male 
reinforcements    First   Later    First   Later    First   Later 
                  born    born     born    born     born    born 
High               13       3        5       1        8       2 
Low                 5       8        1       5        4       3 
                  χ² = 3.91,       p = .05*         p > .05 
                  p < .05 

* By exact test. 
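
The chi-square reported in Table 2 for all listeners can be reproduced from the cell frequencies. The Python sketch below assumes that a 2 × 2 chi-square with Yates' continuity correction was used; the report does not say so, but that formula does yield the tabled value of 3.91. The function name is illustrative.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]] with Yates' correction.

    Assumption: the continuity-corrected formula
    N(|ad - bc| - N/2)^2 / (row and column totals multiplied together).
    """
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * corrected ** 2 / denominator


if __name__ == "__main__":
    # All listeners in Table 2: High (13 first born, 3 later born),
    # Low (5 first born, 8 later born).
    print(round(yates_chi_square(13, 3, 5, 8), 2))  # 3.91
```

With only 12 female listeners the expected cell frequencies are small, which is presumably why an exact test rather than chi-square is reported for that group.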


Discussion 


The instructions to maintain rapport with a 
speaker, even when the speaker is seen to be a 
tape recording, make possible the demonstration 
of regularities in listener reinforcing behavior. 
In spite of the obvious limitations of translating 
reinforcement repertoire into button presses, the 
present results support the contention that re- 
inforcing behavior can be studied as a form of 
emitted behavior having significance for two- 
person interactions. Persons differ widely in 
amount of reinforcing behavior emitted, but 
they show individual consistencies. Because of 
these consistencies amount of reinforcing be- 
havior should be readily perceived by others and 
should figure greatly in judgments of “being 
liked,” “being understood,” etc. (see, e.g., Stoler, 
1963, and Waskow, 1963, for possible clinical 
applications). Whether one reinforces another a 
lot or a little is signaled early in the speaker- 
listener relationship, a fact which also suggests 
that individual differences in reinforcing be- 
havior are clearly defined. It remains to be 
seen whether consistencies, such as those shown 
for amount of reinforcement, are manifest over 
a wide range of role contexts. 

As predicted, one of the organismic variables 
related to reinforcing behaviors is mediated by 
birth order, viz., affiliative tendencies. The pres- 
ent results are consistent with previous reports 
by Schachter (1959) and Dember (1964), all 
of which indicate that the greatest amount of 
affiliative behavior is shown by female first-born 
(or only) children. It appears that first borns 
reflect their "expectancies" about being reinforced 
in their own high output of reinforcements. 
That is, first-born (and only) children 
are probably more responsive to differences in 
amount of reinforcement from others. Conners 
(1963, p. 415) has shown that for a homogeneous 
sample of college males the farther down one 
is in the sibling hierarchy (later borns) the less 
the expectancy of affiliative rewards. 

The social significance of the reinforcing behavior 
studied here was indirectly supported by 
the adjective check list results. If button pressing 
were only a manifestation of differences in motor 
output one would expect to find a relationship 
to other output variables, for example, total 
number of adjectives checked. While button 
pressing and checking adjectives were found to 
be reliable behaviors in their own right (es- 
sentially by test-retest correlations) they were 
unrelated. Likewise, if birth order were mediat- 
ing a general tendency for acquiescence then a 
relationship would be expected between it and 
total number of adjectives checked, especially 
total number of positive adjectives checked. 
Again, no relationships were found between 
birth order and adjectives checked. 

The relationship between amount of reinforcement 
emitted to a speaker and whether one expresses 
"liking" for that speaker is rather complex. 
It will be recalled that all r's between 
number of reinforcements and number of either 
positive or negative adjectives checked were 
positive in sign. The fact that a listener minimally 
reinforced a speaker was not necessarily an 
indication of expressed “disliking” of the speaker. 
In only one instance (males reinforcing a female 
speaker) was there a substantial relationship 
between reinforcement and expressed “liking.” 
It is concluded from these results that whether a 
speaker is described in favorable or unfavorable 
terms is largely independent of the prior rein- 
forcing behavior emitted to the speaker. 

On a common-sense basis one would expect a 
more direct relationship between "liking" and 
emitting reinforcing responses to a speaker. The 
instructional set used here, viz., “maintain rap- 
port with the speaker,” may also have activated 
response tendencies resulting from communalities 
in the socialization of reinforcing behaviors (see 
Weiss, 1964). Listeners might have ignored the 
more immediate sensory input (the affective 
reaction generated by the speaker) and complied 
with the instruction to facilitate the 
speaker's talking. That which would be 
"appropriate reinforcing behaviors" remains for 
further study. Having shown that such behavior 
has consistency and validity we must now look 
at dyadic communication relationships in terms 
of how the participants modulate their reinforcing 
behaviors. 


MEDIATED GENERALIZATION OF ATTITUDE CHANGE 
VIA THE PRINCIPLE OF CONGRUITY ¹ 


PERCY H. TANNENBAUM 


University of Wisconsin 


4 sets of linkages with a single source and 2 concepts were first established— 
the source being for both concepts; for the 1st but against the 2nd; against 
the 1st and for the 2nd; and against both. Attitude toward the 1st concept 
was then manipulated—both favorably and unfavorably—without any mention 
of either the source or 2nd concept. It was reasoned from the principle of 
congruity that such manipulated change in the 1st concept would influence 
the source attitude (as a previous study had indeed demonstrated), and that 
this source change would, in turn, produce appropriate modifications in the 2nd 
concept. The results provided strong confirmation of the theoretical expecta- 
tions, the source-mediated change in the 2nd concept apparently occurring over 
and above any direct transfer from the 1st concept. 


When attitude toward one of two objects 
in a cognitive relationship is modified, there 
often results a change in attitude toward the 
other object in order to maintain cognitive 
consistency. A particular instance of such a 
general consistency theory phenomenon is 
represented in applications of the principle 
of congruity (Osgood & Tannenbaum, 1955) 
where a persuasive communication directed 
at a given topic or concept also results in 
changes in attitude toward the message 
source. 

Such generalization of attitude change from 
concept to source was more clearly indicated 
in a recent extension of the congruity model 
to a situation in which the two main cognitive 
operations involved—the establishment of an 
evaluative relationship between source and 
concept, and the manipulation of the concept 
attitude—were accomplished independently 
(Tannenbaum & Gengel, 1966). Three dif- 
ferent source-concept linkages were first es- 
tablished—one source being for the concept, 
another against, and a third neutral. In a 
subsequent message, the concept was modified 
in either a favorable or unfavorable direction, 
but without any reference to the original 
sources. The resulting relative changes in atti- 
tude toward the sources were in accord with 
the theoretical predictions: evaluation of the 
sources changed in the direction establishing 
a congruent relationship with the altered 
concept position. 


1 This research was supported under Grant 
G-23963 from the National Science Foundation. 
Gerard Leduc helped in the administration of the 
testing and in the data analysis. 

A further extension of the congruity prin- 
ciple as a model for generalization of per- 
suasion is readily apparent. If attitude change 
toward a source results from manipulating 
the attitude toward an evaluatively linked 
concept, then other concepts with which that 
source has been associated should also be 
affected. That is, a given source may be 
linked to a number of different concepts, 
each such linkage constituting a particular 
cognitive relationship. Change in one concept 
affects the source attitude because it intro- 
duces an inconsistency, or incongruity, into 
one of those relationships. But now the 
change in the source creates a new incongru- 
ity with one or another of the remaining 
concepts, attitude toward which should 
change in order to resolve that incongruity. 
In this manner, generalization of attitude 
change from one concept to another may be 
accomplished—mediated through their initial 
association with a common source, and in 
the absence of any direct link between the 
concepts themselves. 

The present experiment was designed to 
investigate such a phenomenon in terms of 
specific congruity principle predictions. Vari- 
ous directed relationships between a given 
source and two different concepts are first 
established, and then attitude toward one 
of the concepts is manipulated, either positively 
or negatively. This should result in 
favorable or unfavorable change in attitude 
toward the source, in accordance with con- 
gruity theory predictions. This source attitude 
change, in turn, should affect attitude toward 
the second concept, also in accord with specific 
congruity predictions. Thus, three critical 
variables are involved in such source- 
mediated generalization of persuasion from one 
concept to a second—the nature of the evalu- 
ative link between the source and the first 
concept, the direction of the manipulated 
attitude change toward the first concept, and 
the nature of the evaluative link between the 
source and the second concept. 
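
The chain of reasoning above implies a qualitative prediction for each combination of the three variables. The Python sketch below enumerates those predictions under a simplifying assumption: each link and each manipulation is coded only by its sign (+1 favorable, -1 unfavorable), and a change propagates across a link by multiplying signs. The condition labels (p/n for the source's stand on the first and second concepts, P/N for the manipulation) follow the Method section; the function names are illustrative, and magnitudes of change are not modeled.

```python
from itertools import product

# +1 = favorable link or change, -1 = unfavorable link or change.
SIGN = {"p": +1, "n": -1, "P": +1, "N": -1}


def predicted_directions(link_1, link_2, manipulation):
    """Predicted direction of change for the source and the second concept.

    link_1: source's stand on the first concept ("p" or "n")
    link_2: source's stand on the second concept ("p" or "n")
    manipulation: direction in which the first concept is changed ("P" or "N")
    """
    concept_1_change = SIGN[manipulation]
    # The source moves with the first concept if it favored it, against it otherwise.
    source_change = concept_1_change * SIGN[link_1]
    # The second concept moves with the source if the source favored it.
    concept_2_change = source_change * SIGN[link_2]
    return {"source": source_change, "concept_2": concept_2_change}


def design_table():
    """All 4 x 2 cells, keyed by (linkage condition, manipulation)."""
    return {
        (link_1 + link_2, manipulation): predicted_directions(link_1, link_2, manipulation)
        for link_1, link_2, manipulation in product("pn", "pn", "PN")
    }


if __name__ == "__main__":
    for cell, prediction in design_table().items():
        print(cell, prediction)
```

For example, when the source favors the first concept but opposes the second and the first concept is made more favorable, the sketch predicts a more favorable source and a less favorable second concept.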


METHOD 
Subjects 


A total of 218 male high-school students attending 
a summer military camp as Army cadets served as 
subjects in the experiment. They ranged between the 
ages of 16-18 years, and participated in the study 
as part of a series of tests they underwent during 
their stay in camp. Because of some incomplete 
participation, the data for 200 subjects were used. 



Procedure 


Materials similar to those employed in the pre- 
vious Tannenbaum and Gengel (1966) study were 
used in the present investigation. On the basis of 
previous testing with undergraduate college students, 
two concepts—teaching machines (TM) and Spence 
learning theory (LT)—and a plausible but fictitious 
source (Prof. Walter E. Samuels of the University 
of California) were selected as the attitudinal objects 
for the study. The main criteria for selection were 
relative neutrality of initial attitude, and a relatively 
small variance. 

Subjects were first tested on attitude toward the 
three objects (T0) as part of a general inventory 
of attitudes and connotative meaning judgments, the 
objects of interest here being imbedded among a 
set of 20 different concepts. One week later, subjects 
were again assembled and were divided into four 
groups according to the intended source-concept 
linkages. One group had the source in favor of both 
the TM and LT concepts (the pp condition); for 
another, the source favored TM but disfavored LT 
(pn); for the third, the source was against TM but 
for LT (np); for the fourth, the source was against 
both concepts (nn).² 


2 Neutral linkage conditions were not included in 
the present design—partly because they were not 
absolutely essential to test the main theoretical 
predictions, and partly because the results of the 
Tannenbaum and Gengel (1966) study indicated a 
possibility of a contamination of the generalization 
data, as such, in a neutral linkage situation.

All subjects then participated in a totally separate
task—completing several parts of the MMPI—for 
approximately one-half hour. After this interval of 
irrelevant activity, subjects were exposed to the TM 
concept manipulation message. Half the subjects in 
each linkage condition received a positive version 
designed to boost the TM attitude (the P treat- 
ment), and the other half received a belief attack 
(the N treatment). Attitudes toward the source and 
the two concepts were again assessed after the 
experimental messages (T1). Thus, the basic design
was a before-after 4 × 2 (Linkages × Manipulations)
factorial design with independent cells (n = 25 per
cell).
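
The layout of this design can be sketched as follows. This is only an illustrative enumeration: the variable names are hypothetical, and the per-cell count is simply the 200 retained subjects divided evenly over the cells, which agrees with the 192 within-cell degrees of freedom reported in the analyses of variance.

```python
from itertools import product

# Four source-concept linkages: first letter is the source's stand on TM,
# second letter its stand on LT (p = in favor, n = against).
linkages = ["pp", "pn", "np", "nn"]
# Two TM manipulations: P = positive message, N = negative message.
manipulations = ["P", "N"]

# Each cell label is the manipulation followed by the linkage, e.g. "Ppn".
cells = [m + l for m, l in product(manipulations, linkages)]

n_subjects = 200                        # subjects retained for analysis
n_per_cell = n_subjects // len(cells)   # equal, independent cells
df_within = n_subjects - len(cells)     # within-cell degrees of freedom

print(cells)
print(n_per_cell, df_within)
```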


Experimental Materials 


Linkage messages. The various linkages were es- 
tablished in messages (of approximately 250 words 
each) reporting an ostensible symposium at the 1963 
convention of the American Psychological Association
dealing with “new educational procedures.” To
heighten interest somewhat, the message claimed the 
symposium was a widely discussed one which had 
“caused quite a stir... and considerable debate.” 
Actually, the message only mentioned Prof. Samuels' 
position either for or against teaching machines or 
the Spence learning theory, without reference to 
other aspects of the alleged symposium. In order 
not to single out Prof. Samuels unduly the message 
also mentioned another individual, Prof. George L. 
Maclay of Cornell, as chairman of the symposium. 

The main purpose of these messages was merely
to establish the position of the source on both concepts.
Accordingly, there was a minimum of further
embellishment of information about either of the 
concepts or of further detail of the source's position. 

To establish the favorable TM connection, the 
message merely stated: 


Professor Samuels, a strong proponent of teaching 
machines, praised the use of teaching machines for 
instructional purposes in no uncertain terms. He
hailed teaching machines as “the most significant
single contribution of the behavioral sciences in
the field of education.” 


The unfavorable TM version was almost identical 
in wording except that “opponent” was substituted
for “proponent,” and “attacked” for “praised.” The
direct quotation has Samuels describing teaching machines
as a “most pernicious influence on the entire
educational system and a source of shame for 
behavioral sciences.” 

In either case, the connection with the second 
concept was made immediately after the passage
dealing with teaching machines. The favorable LT
version read:

At the same time, Professor Samuels also
expressed himself as strongly in favor of
Spence Learning Theory. Known as a vigorous
supporter of the Spence Theory, he called it “the
most compelling explanation of the learning
process yet presented.”


The unfavorable LT connection was again highly 
similar. The word “against” was substituted for “in 
favor of,” and “antagonist” for “supporter,” with
the quotation calling the theory “a veritable hodgepodge
of unfounded notions with no meaningful
basis.”

Manipulation of concept attitude. Teaching ma- 
chines was selected as the concept to be manipulated, 
with Spence learning theory as the secondary concept 
to which generalization would be investigated— 
largely because TM materials were already available. 
Both the positive and negative treatments of atti- 
tude toward TM were accomplished in messages 
purporting to be copies of an Associated Press 
article dealing with “a comprehensive report on 
teaching machines from the U. S. Office of Edu- 
cation,” and were similar to those used in the 
Tannenbaum and Gengel (1966) study. The articles 
were about equal in length (approximately 475 
words) and very similar in format, each stating 
their respective position on TM and citing a half- 
dozen more strongly worded arguments—with liberal 
quotation from the alleged report—in support of 
that position. In neither case was the source of the 
linkage message at all mentioned. 


Attitude Measure 


Four semantic differential scales (cf. Osgood, Suci, 
& Tannenbaum, 1957), imbedded in a total set of 
10 such scales, were used to assess attitude. These 
were selected on the basis of a factor analysis of 
the present data to be most representative of the 
evaluative factor on the particular attitudinal objects
involved. They included: good-bad, worthless-valuable,
successful-unsuccessful, and important-unimportant.
The sum of ratings across all four
scales, adjusted for consistency of attitudinal direction,
constituted the attitude measure at both the
T0 and T1 test sessions, with the T1 − T0 difference
serving to index the dependent variable of attitude
change.


3 Copies of the different experimental materials
involved may be obtained from the Mass Communications
Research Center, University of Wisconsin,
Madison, Wisconsin 53706.
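
The scoring rule for the attitude measure can be illustrated with a minimal sketch. Several details here are assumptions, not taken from the study: the scales are treated as 7-point scales with position 1 at the left-hand adjective, reverse keying is done by subtracting from 8, and the function names and example ratings are hypothetical.

```python
# Assumed number of scale positions; the text does not state it.
SCALE_MAX = 7

# True when the left-hand adjective is the favorable pole of the scale.
FAVORABLE_ON_LEFT = {
    "good-bad": True,
    "worthless-valuable": False,
    "successful-unsuccessful": True,
    "important-unimportant": True,
}

def attitude_score(ratings):
    """Sum the four scales after keying each so that a high score is favorable."""
    total = 0
    for scale, favorable_left in FAVORABLE_ON_LEFT.items():
        raw = ratings[scale]
        # Reverse the scale when its favorable pole sits at position 1.
        total += (SCALE_MAX + 1 - raw) if favorable_left else raw
    return total

def attitude_change(before, after):
    """Attitude change is the after-session score minus the before-session score."""
    return attitude_score(after) - attitude_score(before)

# Hypothetical ratings for one subject at the two test sessions.
t0 = {"good-bad": 4, "worthless-valuable": 4,
      "successful-unsuccessful": 4, "important-unimportant": 4}
t1 = {"good-bad": 2, "worthless-valuable": 6,
      "successful-unsuccessful": 3, "important-unimportant": 2}

print(attitude_score(t0), attitude_score(t1), attitude_change(t0, t1))
```

Keying every scale in the same direction before summing is what "adjusted for consistency of attitudinal direction" amounts to in this sketch; without it, the worthless-valuable scale would cancel against the other three.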


RESULTS 


The postulated mechanism for generaliza- 
tion involves a three-stage process. The TM 
attitude must first be altered by the experi- 
mental manipulations. This should affect the 
source attitude in accord with congruity prin- 
ciple predictions. Then, the critical third 
stage—change in the dependent LT concept 
—can be properly assessed. We will consider 
each in turn. 


Change on Manipulated Concept (TM) 


The primary intention of the manipulation 
messages was to change attitude toward the 
TM concept—in a favorable direction for the 
positive (P) treatment, and unfavorable for 
the negative (N) treatment. Table 1 reports 
the mean change scores for the different 
conditions and indicates a highly significant 
difference between the two experimental 
treatments. A separate analysis showed a 
significant (p < .001, in each case, by sign 
test) shift within each treatment. 

The lack of a significant difference between
the linkage conditions, along with the nonsignificant
interaction effect, is to be expected
at this stage, since the linkage conditions
should have little relevance to change
on the TM concept, as such. Actually, some
influence of the linkage message is apparent:
The np and nn conditions, which involved
some negative statement about the TM concept
in the prior linkage messages, do not
exhibit quite as much positive change in the
P treatment as do the pp and pn linkages.
Similarly, the latter conditions are not quite
as negative in the N treatment. In all
cases, however, these differences are short of
statistical significance.


TABLE 1
MEAN ATTITUDE CHANGE IN MANIPULATED CONCEPT (TM) AND RESULTS OF ANALYSIS OF VARIANCE

TM concept               Source-concept linkages
manipulation       pp       pn       np       nn     Marginals
Positive (P)     +8.56    +9.12    +7.32    +7.16     +8.04
Negative (N)     −6.96    −7.12    −7.84    −8.24     −7.54
Marginals        + .80    +1.00    − .26    − .54

Source                     df          MS          F
Between linkages            3        23.15         —
Between manipulations       1    12,136.82      268.28*
Interaction                 3         5.70         —
Within cells              192        45.24

Note.—Means with the same alphabetical subscript are not significantly different from one another at the .05 level by Newman-Keuls test (cf. Winer, 1962, pp. 80-85).
*p < .001.
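
As a small arithmetic check on Table 1, each F ratio is the mean square for an effect divided by the within-cells mean square. The sketch below uses only the two mean squares needed for the manipulation effect; the function name is hypothetical.

```python
def f_ratio(ms_effect, ms_error):
    """F is the effect mean square divided by the error (within-cells) mean square."""
    return ms_effect / ms_error

ms_within = 45.24            # within-cells mean square from Table 1
ms_manipulations = 12136.82  # between-manipulations mean square from Table 1

f_manipulations = f_ratio(ms_manipulations, ms_within)
print(round(f_manipulations, 2))
```

The result agrees with the tabled value of 268.28. Any effect whose mean square is smaller than 45.24 necessarily yields an F below 1, which is why the linkage and interaction rows carry a dash.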


Source Attitude Change 


This intermediate stage of the present study 
is similar to the main focus of the previous 
Tannenbaum and Gengel (1966) study, and 
accordingly the same predictions, derived 
from congruity theory, apply. For example,
where there was a positive linkage between
the source and TM and then TM was
changed negatively, we would expect the
source to change in a negative direction to
maintain congruity. In this manner, favorable
source attitude change is predicted for
the pp and pn conditions under the P treatment,
and in the np and nn conditions under
the N treatment. The situation is reversed for
prediction of negative source change—in the
np and nn conditions for P, and the pp and
pn conditions for N. It should be noted that
change in source attitude is predicted solely
on the basis of its linkage with the initially
manipulated TM concept, and is independent
of its association with the second, nonmanipulated
concept.

Table 2 clearly indicates that the various
source changes occur as predicted. There is a
highly significant interaction effect, but the
overall main effects are not significant. It is
obvious, however, that in both cases, this is
due to a canceling-out of significant changes
in opposing directions within a given row or
column. These findings are in substantial
accord with the results of the previous study. 


Change on Nonmanipulated Concept (LT) 


Given that the first two stages of the hy- 
pothesized process functioned as expected, 
the data on the main dependent variable may 
be analyzed. The relevant variables here are 
the new source attitude in each condition and 
the nature of the initial linkage between the 
source and the nonmanipulated concept. 
Where the source has become more favorable, 
its position vis-à-vis the LT concept should 
be reflected in the actual LT change—in a 
negative direction if the linkage was a nega- 
tive one, and a favorable change if the link- 
age was a positive one. On the other hand, if 
the source attitude has altered in an unfavor- 
able direction, then the LT change should be 
opposite to that advocated by the source—
that is, a favorable change where the linkage
was negative, and an unfavorable change
where it was positive. By applying the congruity
principle in this manner, we would
thus predict favorable LT changes in the
Ppp, Nnp, Npn, and Pnn conditions, and unfavorable
LT changes in the Ppn, Nnn, Npp,
and Pnp conditions.


TABLE 2
MEAN ATTITUDE CHANGE ON SOURCE (SAMUELS) AND RESULTS OF ANALYSIS OF VARIANCE

TM concept               Source-concept linkages
manipulation       pp       pn       np       nn     Marginals
Positive (P)     +4.96    +4.72    −2.52    −2.96     +1.05
Negative (N)     −3.96    −3.00    +3.76    +3.32     + .03
Marginals        + .50    + .86    + .62    + .18

Source                     df          MS          F
Between linkages            3         4.00         —
Between manipulations       1        52.02        2.35
Interaction                 3       891.17       40.25*
Within cells              192        22.14

Note.—Means with the same alphabetical subscript are not significantly different from one another at the .05 level by Newman-Keuls test.
*p < .001.


TABLE 3
MEAN ATTITUDE CHANGE ON NONMANIPULATED CONCEPT (LT) AND RESULTS OF ANALYSIS OF VARIANCE

TM concept               Source-concept linkages
manipulation       pp       pn       np       nn     Marginals
Positive (P)     +2.84    −1.92    − .76    +1.56     + .43
Negative (N)     −2.72    + .60    +2.12    −2.52     − .63
Marginals        + .06    − .66    + .68    − .48

Source                     df          MS          F
Between linkages            3        18.20        1.48
Between manipulations       1        56.18        4.56*
Interaction                 3       240.45       19.52
Within cells              192        12.32

Note.—Means with the same alphabetical subscript are not significantly different from one another at the .05 level by Newman-Keuls test.


The relevant LT change data are presented 
in Table 3, and indicate that all changes are 
in the predicted directions. The four cells in 
which a positive change was anticipated on 
the basis of the congruity formulations all 
change in that direction, with the differences 
between them being not significant. Similarly,
the predicted negative changes obtain, again
without significant differences among them.
However, the differences between matched
pairs of positive and negative changes are
always significant. Indeed, the only lack of
significant difference between any pair of
positive and negative means is between the
least changing positive one, Npn, and the
smallest negative one, Pnp. Comparing the
two groups within a given linkage condition
—for example, Ppp versus Npp, and so on—
we find them to change, as anticipated, in
opposite directions and to differ significantly
from one another. 
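
The predicted directions against which these results are compared can be derived mechanically. The sketch below reduces the congruity reasoning to a sign rule, which is a simplification for illustration only (the congruity principle also concerns magnitudes); the function and variable names are hypothetical.

```python
from itertools import product

# +1 = favorable / positive, -1 = unfavorable / negative.
SIGN = {"P": +1, "N": -1, "p": +1, "n": -1}

def predicted_changes(manipulation, linkage):
    """Return (source_change, lt_change) as +1 or -1.

    manipulation: "P" or "N" (direction of the TM message).
    linkage: two letters, the source's stand on TM, then on LT.
    """
    tm_link, lt_link = linkage
    # The source moves with TM if it favored TM, against TM otherwise.
    source_change = SIGN[manipulation] * SIGN[tm_link]
    # LT moves toward the source's stand if the source gained favor,
    # and away from that stand if the source lost favor.
    lt_change = source_change * SIGN[lt_link]
    return source_change, lt_change

conditions = [m + l for m, l in product("PN", ["pp", "pn", "np", "nn"])]
favorable = {c for c in conditions if predicted_changes(c[0], c[1:])[1] > 0}
unfavorable = {c for c in conditions if predicted_changes(c[0], c[1:])[1] < 0}

print(sorted(favorable))
print(sorted(unfavorable))
```

Under this rule the favorable set is Ppp, Nnp, Npn, and Pnn, and the unfavorable set is Ppn, Nnn, Npp, and Pnp, matching the predictions stated in the text.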


Discussion 


It is clear that the results confirm the 
theoretical predictions and hence provide support
for the proposed theoretical model. It is
important to note that within such a model, 
the change on the second concept may be in 
a direction opposite to that on the manipu- 
lated first concept—and, for that matter, the 
source attitude change may also be the re- 
verse of that on the first concept. This is in 
distinction to other instances in which the 
mediated transfer of evaluation has been ex- 
hibited through conventional conditioning 
procedures. For example, Staats, Staats, and
Heard (1959) used a semantic generalization 
paradigm, and found that the association of 
highly evaluated terms with stimulus words 
spread to synonyms of those stimulus words. 
Even more to the point, Das and Nanda 
(1963), in a sensory preconditioning design, 
found that the association of the evaluative 
words “good” and “bad” with nonsense sylla- 
bles influenced the judgment of tribal names 
which had been previously linked to the non- 
sense terms. In such cases, the judgment of 
the “conditioned stimulus” (e.g., the tribal 
names) was always the same, in attitudinal 
direction if not in degree, as the already established
evaluation of the “unconditioned
stimulus” (e.g., the words “good” and “bad”).

That such a direct transfer of attitude 
change may be taking place in the present 
experimental situation is suggested by the 
difference in LT change as a function of the 
difference in manipulation of the TM concept, 
as indicated by the analysis of variance in
Table 3. That is, where the initial TM change
was positive (P manipulation) the overall LT
change was also positive; there is a similar
correspondence in the negative manipulation
condition, the difference between the two
manipulations being significant (p < .05).
While this finding was not expected—the TM 
concept manipulation treatment, as such, was 
presumably not a relevant variable affecting 
LT change—it does not detract from the re- 
sults in terms of the postulated congruity 
model, which can still function over and 
above any direct generalization from one con- 
cept to the other. 

This is readily apparent when the means for 
selected individual cells are examined. If only 
direct transfer of change were operating, we 
would expect that in the P manipulation 
treatment, all four cells would change posi- 
tively. This is clearly not the case here—the 
Ppn and Pnp actually change in a negative 
direction, and are significantly different from 
the positively changing Ppp and Pnn groups. 
The same is true for the N manipulation treatment
of the first concept—the Npn and Nnp
cells change positively rather than negatively
and again are significantly different from the
corresponding negatively changing Npp and
Nnn cells. It is in these differential linkage
conditions—the pn and np linkages under
both the P and N manipulations—that the
critical distinctions are to be found, with the
results completely in accord with the congruity
model predictions. Indeed, it is in these
conditions that the postulated mediated generalization
model had to operate in contradiction
to the direct generalization effects—thus
making the obtained findings all the more impressive.

Thus, though there is some evidence of di- 
rect generalization, the obtained results are 
best explained in terms of the congruity model 
which generated the study: Change in atti- 
tude toward an unmanipulated concept is a 
direct consequence of the tendency to main- 
tain the various source-concept relationships 
involved in a psychologically harmonious or 
congruent state. At times, such change is in
the same direction as that of the manipulated
concepts, at times it is in the opposite direction—but
it is always in that direction which
makes for a congruent situation. Such a theoretical
formulation might also be applied to
explain the results obtained by Weiss (1957) 
in a study which had some similar properties 
to the present one. He found that the prior 
establishment of a negative source favoring a 
presumably highly valued concept facilitated 
the impact of a subsequent message in which 
that source attacked another concept. Uncer- 
tain as to “the nature of the facilitating psy- 
chological processes," Weiss dismissed an in- 
crease in source “trustworthiness” given the 
particular source employed (the Daily 
Worker), and tended to favor the apparent 
“opinion congruence” between the source and 
the subject created by the first message as the 
critical factor. This notion, not further devel- 
oped by Weiss, appears highly similar to the 
theoretical rationale derived more explicitly 
from the congruity principle—which is prob- 
ably the main reason why such similar labels 
were used, apparently independently. 

The basic model, of course, also applies to 
the determination of changes noted in the 
intermediate stage of this investigation— 
change in the source attitude. As with the 
concept-to-concept change situation, the di- 
rection of the source change may or may not 
be in the same direction as the initial concept 
change, depending on the particular congruity 
conditions. In this sense, the results of the
present experiment are even more impressive 
in their agreement with the basic theoretical 
predictions than were the findings in the 
earlier study (Tannenbaum & Gengel, 1966), 
where all the mean source-attitude changes 
were in a favorable direction. A number of 
possible reasons for this difference are sug- 
gested by some methodological differences be- 
tween the two studies—for example, the use 
of somewhat different messages, deliberately 
rewritten for the entire passages to make 
them more persuasive, in the present study; 
use in the present study of perhaps more im- 
pressionable high school, as opposed to col- 
lege, subjects; a basic design change from the 
use of different sources within the same group 
to represent the different source-concept linkages,
to the use of the same named source in
different groups. Each of these alternatives
may have allowed for a clearer manifestation
of the generalization phenomenon under
study, but these and others remain purely 
speculative for the present.


REDUNDANCY IN IMPRESSION FORMATION


DAVID S. DUSTIN AND PATRICIA M. BALDWIN ¹


University of Texas 


In 3 studies, 40, 408, and 256 college undergraduates made evaluative ratings 
of hypothetical persons described by personality adjectives A, B, or A and B. 
12 adjective pairs were rated in the 1st study, 6 in the 2nd, and 2 in the 3rd.
The extent to which each adjective in an A-B pair implied the other (i.e.,
redundancy) was varied by selection of adjective pairs in the first 2 studies 
and by pretraining Ss in the 3rd study. Results from all 3 studies confirmed 
the predictions that the rating of an AB person would tend to be more ex- 
treme than the mean of the ratings of an A person and a B person, and that 
this tendency would be greater the less redundant A and B. 


Asch (1946), in the article which initiated 
the study of impression formation, claimed 
that a total impression of, or scale value 
assigned to, a person is something more com- 
plex than a simple additive combination of 
the scale values of the separate items of in- 
formation about that person. 

In contrast, the current interest of students
of impression formation seems to be focused
squarely upon just such simple additive
models. For example, Fishbein and Hunter
(1964) recently raised the question whether 
a “mean” or a “summation” model is more 
applicable to impression formation. Both are 
simple additive models, differing only in that 
the mean model weights each separate item of 
information by 1/N, the inverse of the num- 
ber of items, while the summation model 
assigns each item a weight which is larger 
than 1/N, but presumably no larger than 
unity. 

However, such simple additive models prob- 
ably are inadequate to handle some impres- 
sion formation data because they do not take 
into account relationships (i.e., redundancy) 
existing among the items of information go- 
ing into an impression. One way of solving 
this problem is to apply these models only 
to unrelated items of information (Levy & 
Richter, 1963; Wishner, 1960); another is to 
elaborate the simple additive model some- 
what to take item redundancy into account. 
It is the purpose of this paper to demon- 
strate the feasibility of this latter course. 

1 The authors wish to express their appreciation
to J. D. Jecker and J. C. Loehlin for their critical
readings of an earlier draft of this paper.


A model which takes into account relation- 
ships or redundancy among items of infor- 
mation going into an impression is described, 
for the case of two items, by the following 
equation: 


$$V_{AB} = w_1 V_A + w_2 V_B - w_3 R_{AB}$$

where:


$V$ = a value on some dimension; in this paper, a
positive or negative value on an evaluative dimension

$A$, $B$ = single items of information; here, single
personality adjectives attributed to separate persons

$AB$ = two items of information in combination;
here, the personality adjectives A and B attributed
to the same person

$R_{AB}$ = the degree of positive or negative implicative
relationship between Items A and B; while
individual measures of $R_{AB}$ could presumably be
obtained, group measures are used here

$w$ = a weighting coefficient


It perhaps should be emphasized that while 
the relationship term has, in the above equa- 
tion, the position often assigned to tautologi- 
cal error or correction terms, it is the present 
intention to measure this term independently. 

The above equation implies that the addi- 
tion of a positive trait B to the characteriza- 
tion of an individual already known to have 
positive trait A would increase the overall
evaluation of the individual to the extent that
the information conveyed by B was not already
implied by A—that is, not redundant.
In this paper, redundancy is considered to
increase as the implicative relationship between
A and B moves from extreme negative,
through zero, to extreme positive.

The equation in its above form is applicable
when the mean evaluation of A and B is
positive. When this value is negative, it is
necessary to reverse the sign of the R term
so that redundancy tends to reduce the extent
to which the values of two negative traits
add together.

While relatively little attempt is made here
to argue for any specific set of weights, it is
interesting to consider that if the weighting
coefficients of the values of Items A and B
were set at unity, and $R_{AB}$ were weighted by
the mean value of A and B, the model would
imply a $V_{AB}$ which is the mean (M) of $V_A$
and $V_B$ when $R_{AB}$ is perfect and positive,
which is a sum (2M) when $R_{AB}$ is zero, and
which is more extreme than a sum (3M)
when $R_{AB}$ is perfect and negative.
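
The three cases just described can be verified numerically. The sketch below is illustrative only: the function name, the default arguments, and the example values of 2.0 and 3.0 are hypothetical, and only the positive-evaluation case is shown.

```python
def combined_value(v_a, v_b, r_ab, w1=1.0, w2=1.0, w3=None):
    """Value of the AB combination under the redundancy model.

    By default the item weights are unity and the relationship term
    is weighted by the mean of the two separate values, as in the text.
    """
    if w3 is None:
        w3 = (v_a + v_b) / 2.0
    return w1 * v_a + w2 * v_b - w3 * r_ab

v_a, v_b = 2.0, 3.0          # hypothetical positive evaluations
m = (v_a + v_b) / 2.0        # their mean, M

print(combined_value(v_a, v_b, 1.0))    # perfect positive relation: M
print(combined_value(v_a, v_b, 0.0))    # no relation: 2M
print(combined_value(v_a, v_b, -1.0))   # perfect negative relation: 3M
```

With these weights the combined value runs from M through 2M to 3M as the relationship goes from +1 through 0 to −1, so any relationship short of +1 gives a combination more extreme than the mean of the separate values.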

Thus, if $R_{AB}$ took any value less positive
than that of +1, the model when supplied
with the above weights would lead to the two
hypotheses tested in this paper: first, the
evaluation of adjective combination AB will
tend to be more extreme than (but have the
same sign as) the mean of the separate evaluations
of A and B; and second, this tendency
will be greater the less redundant A and B. 


STUDY 1
Method 


Stimuli. With the aid of adjective intercorrela- 
tions reported by Osgood, Suci, and Tannenbaum 
(1957) and Wishner (1960), the investigators se- 
lected six pairs of personality adjectives such that 
each adjective was positively related to the adjective 
“good,” and the intrapair relationships were expected 
to vary widely in strength. The selected adjective 
pairs were relaxed-calm, courageous-brave, impor- 
tant-successful, sensitive-interesting, polite-practical, 
and cautious-warm. Also used were the negatively 
evaluated opposites of the above adjectives: tense- 
agitated, timid-cowardly, unimportant-unsuccessful, 
insensitive-boring, blunt-impractical, and impulsive- 
cold. 

Procedure. Each subject completed a questionnaire
containing 36 statements of the type “A person
who is _____ is probably:” by making ratings
on four 12-point bipolar scales (scored 5.5 to −5.5)
—good-bad, strong-weak, and two additional scales 
defined by the two relevant paired adjectives and 
their opposites. 

Two statements were printed on each page of the 
questionnaire. For half the subjects, on each of the 
first six pages the first statement was filled in with a 
pair of positively evaluated adjectives (e.g., relaxed
and calm), while the second statement was filled in
with the pair of negatively evaluated opposites
(e.g., tense and agitated). For the same subjects, on
each of the last 12 pages the first statement was
filled in with a single positively evaluated adjective
(e.g., relaxed), while the second statement was
filled in with the negatively evaluated opposite (e.g.,
tense). The other half of the subjects rated the adjectives
first singly and then in pairs.

Half of each of these groups was presented with 
the combined adjectives in the AB order while the 
other half was presented with them in the BA 
order. Except for the above restrictions, the order 
of questionnaire pages was randomized.

Subjects. Forty-two students enrolled in a junior 
level psychology course at the University of Texas 
filled out the questionnaires during a course labora- 
tory period. After two incomplete questionnaires 
were discarded, data for 20 men and 20 women 
remained for analysis, and the sex variable was 
orthogonal to each of the order of presentation 
variables.


Results 


For each of the six adjective pairs two 
point-biserial correlations (McNemar, 1955) 
were computed: one from the ratings of A 
and its opposite (e.g., relaxed-tense) on a 
scale defined by B and its opposite (e.g., calm- 
agitated), and the other from the ratings of 
B and its opposite on a scale defined by A 
and its opposite. The two coefficients were 
then averaged using the z-transformation 
method (McNemar, 1955, p. 148). The re- 
sulting mean correlations are shown in the 
second column of Table 1. 
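
Averaging two correlations through the z transformation can be sketched as follows. The function name and the example coefficients are hypothetical; the transformation itself is the standard Fisher r-to-z conversion, here written with the inverse hyperbolic tangent.

```python
import math

def mean_r_via_z(r1, r2):
    """Average two correlation coefficients on the Fisher z scale."""
    z1 = math.atanh(r1)   # r -> z
    z2 = math.atanh(r2)
    return math.tanh((z1 + z2) / 2.0)   # mean z -> r

# Hypothetical pair of coefficients for one adjective pair.
r_mean = mean_r_via_z(0.90, 0.96)
print(round(r_mean, 3))
```

For two high positive coefficients the z-based mean lies slightly above their simple arithmetic mean, because the z scale stretches values near 1.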

Next, overall means were computed on the
good-bad and active-passive ratings of the
adjectives separately and in pairs. Unexpectedly,
the mean separate evaluations for all
six of the “negatively evaluated” pairs tended
to be slightly positive, as shown in the lower
half of column three, Table 1. It was assumed
that these values were actually negative with
respect to a subjective scale midpoint which
was somewhat above the zero point of the
scale. This assumption receives support in
Study 2.


TABLE 1
EVALUATIVE RATINGS FOR 12 PAIRS OF ADJECTIVES
SEPARATELY AND IN COMBINATION

Adjectives (A-B)               M r_pb    M evaluation of (A + B)/2
Relaxed-calm                    .947            2.18
Courageous-brave                .848            2.42
Important-successful            .766            1.99
Sensitive-interesting           .689            2.38
Polite-practical                .446            2.14
Cautious-warm                  −.367            2.49
Tense-agitated                  .947            0.50
Timid-cowardly                  .848            0.74
Unimportant-unsuccessful        .766            1.00
Insensitive-boring              .689            0.11
Blunt-impractical               .446            0.55
Impulsive-cold                 −.367            0.48

Note.—The mean evaluations in this table are located on a
good-bad scale ranging from 5.5 to −5.5.


Table 1 shows that, as hypothesized, the 
mean evaluation of adjective combination 
AB was more positive than the mean evalu- 
ation of (A + B)/2 for each of the six posi- 
tive adjective pairs, and more negative for 
each of the six negative adjective pairs. If 
positive and negative separate-combination 
differences were equally likely, the probabil- 
ity of obtaining six successive differences hav- 
ing the same sign would be .03. 
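
The probability quoted above follows from a simple sign argument: six independent differences, each equally likely to be positive or negative, all share one sign with probability 2 × (1/2)^6. A minimal sketch, with a hypothetical function name:

```python
def prob_all_same_sign(n):
    """Probability that n independent, equally likely +/- differences all share one sign."""
    # Either all positive or all negative, hence the factor of 2.
    return 2 * 0.5 ** n

p = prob_all_same_sign(6)
print(p, round(p, 2))
```

The exact value is .03125, which rounds to the .03 given in the text.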

Spearman rank-difference correlations
(rhos) were run between the ranks of the
correlations and the ranks of the separate-combination
differences. The rho involving
the good-bad ratings of the six negative adjective
pairs was .94 (p < .05, two-tailed),
which supported the hypothesis. However,
neither the rho involving the good-bad ratings
of the positive pairs, nor the rhos involving
the active-passive ratings approached
significance.
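
The rank-difference coefficient can be illustrated with a short sketch, assuming no tied ranks. The two rank orders below are hypothetical and are not the study's data; they show that, with six pairs, a single swap of adjacent ranks (a sum of squared rank differences of 2) yields a rho of about .94.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman rank-difference correlation for untied ranks."""
    n = len(ranks_x)
    d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical ranks for six adjective pairs; ranks 3 and 4 are swapped.
ranks_a = [1, 2, 3, 4, 5, 6]
ranks_b = [1, 2, 4, 3, 5, 6]

rho = spearman_rho(ranks_a, ranks_b)
print(round(rho, 2))
```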

With regard to the evaluative ratings, the 
finding that the predicted relationship held 
for the negative adjectives but not for the 
positive ones, plus the likelihood that the sub- 
jective scale midpoint was somewhat above 
scale zero, suggested the possibility that in 
the case of the positive adjectives only, a 
ceiling effect was operating to prevent the 
ratings of the combination from becoming 
more extreme as the pair intercorrelations decreased.


STUDY 2


A second study was conducted to further 
test the hypotheses and to find out whether 
nonevaluative ratings might show the pre- 
dicted effects if subjects were given a smaller 
burden of ratings than was the case in Study 
1. Since it seemed possible that there are 
special difficulties in testing the hypotheses 
using positively evaluated adjectives, only 
negatively evaluated adjectives were used in 
the second study.

Method


Stimuli. Three sets of negative adjectives were 
selected, each set conforming to the pattern AB-AC. 
That is, each set consisted of two adjective pairs 
having one adjective in common. One of these ad- 
jective pairs was selected, with the help of Webster's 
Dictionary of Synonyms (1951), to be highly inter- 
related, while the other was selected, with the help 
of adjective intercorrelations reported by Osgood 
et al. (1957), to be as unrelated as possible. The
AB-AC sets were: unlenient-unforgiving, unlenient- 
inconsistent; ungenerous-uncharitable, ungenerous- 
unstable; and passive-unenergetic, passive-uncalm. 

Procedure. Subjects completed statements of the
type “When I imagine a _____ person, I imagine a
person who is probably:” by making ratings on
five 7-point bipolar scales (scored 3 to −3):
pleasant-unpleasant; rugged-delicate; cruel-kind; a
scale which, when a single adjective was being
rated, was defined by the paired adjective and its
opposite and which, when an adjective combination
was being rated, was defined by impractical-practical;
and nice-awful. The first, third, and fifth of
these scales were taken from the evaluative dimension
of the semantic differential (Osgood et al.,
1957), and in analysis each subject’s ratings on
these scales were summed to give one evaluative
score.

All subjects warmed up by rating a “polite” per- 
son and an “average” person on the first two pages 
of the questionnaire. The stimulus adjectives presented
on the remaining page(s) depended on
which of 12 groups the subject was in; for each of
the three AB-AC sets, one group rated an A (e.g.,
ungenerous) person and a B (e.g., uncharitable)
person, a second group rated an A (e.g., ungenerous)
person and a C (e.g., unstable) person, a third group
rated an AB (e.g., ungenerous and uncharitable) person,
and a fourth group rated an AC (e.g., ungenerous
and unstable) person.

The two terms in each adjective combination 
were always presented in the order in which they 
are given above; adjectives rated separately were 
varied in order of presentation, although no attempt 
was made to ensure that equal numbers of subjects 
received the two orders of presentation. 

Subjects. During class time, 455 men and women 
in an introductory psychology class at the Univer- 
sity of Texas participated in the study. To equalize 
cell sizes, 47 subjects were randomly discarded, 
leaving 34 subjects in each of the 12 conditions, or 
a total of 408 subjects. 


Results 


To determine whether subjects perceived
the intended differences in relationship between
high-related and low-related pairs, for
each adjective set a t test was run on the
difference between the AB subjects’ mean
rating of B (e.g., uncharitable) on a scale defined
by A and its opposite (e.g., generous-ungenerous)
and the AC subjects’ mean rating
of C (e.g., unstable) on a scale defined
by A and its opposite (e.g., generous-ungenerous).
For all three comparisons, the differences
were in the intended direction, and
were significant at the .01 level.

As in Study 1, the means of the mean 
separate evaluative ratings of a number of 
adjective pairs were unexpectedly above the 
scale zero point; specifically, Table 2 shows 
that this mean was .94 for passive-unener- 
getic, and .65 for passive-uncalm. However, 
the overall mean evaluative rating of an 
"average" person—which might be consid- 
ered to represent the subjective midpoint of 
the scale—was 3.28, high enough to make all 
the means in Table 2 subjectively negative. 

A two-way analysis of variance performed 
on the combination and mean separate evalua- 
tive ratings indicated that, as hypothesized, 
the combination ratings were significantly 
(F = 67, df = 1/396, p < .01) more negative 
than the mean separate ratings and that 
there was a significant (F = 4.5, df = 1/396, 
p < .05) interaction between high-low rela- 
tion and separate-combination rating which 
conformed to hypothesis. That is, as can be 
seen in the last column of Table 2, the sepa- 
rate-combination differences tended to be 
more positive for the three low-related adjec- 
tive pairs than for the three high-related pairs. 
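
As a consistency check on the degrees of freedom reported above, the following sketch recomputes the Study 2 cell bookkeeping from the figures given under Subjects. It assumes, which the text does not state, that the error term is the usual within-cell term (retained subjects minus number of groups); the variable names are illustrative.

```python
# Figures stated in the Study 2 Subjects paragraph.
subjects_run = 455
randomly_discarded = 47
groups = 12
per_group = 34

retained = subjects_run - randomly_discarded

# Assumption: within-cell error term, df = N - number of groups.
error_df = retained - groups

print(retained, groups * per_group, error_df)  # 408 408 396
```

Under that assumption the result agrees with the denominator in the reported df = 1/396.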

Finally, the high-low relation main effect
was significant (p < .01), with the low-related
adjectives being evaluated more favorably
than the high-related adjectives. The
main contributor to this effect was the fact
that the mean separate evaluations were significantly
(p < .01) more negative in the
high-relation condition than in the low-rela- 
tion condition. This suggests the possibility 
that the interaction reported above was due 
not to redundancy but to a ceiling effect 
whereby there was not as much room for the 
combination ratings to be more negative than 
the mean separate ratings in the case of the 
high-related adjectives as there was in the 
case of the low-related adjectives. However, 
while a ceiling hypothesis would imply that 
the separate-combination difference would 
decrease as the mean separate evaluation be- 
comes more extreme, the data within high- 
and low-relation conditions tend to show the 
opposite kind of relationship. Therefore, since 
there is no evidence for a ceiling effect op- 
erating within high- and low-relation condi- 
tions, it seems unlikely that such an effect 
was operating between those conditions to 
produce the obtained interaction.

A two-way analysis of variance on the
“rugged-delicate” ratings showed no significant
effects. The fact that these negative results
could not be attributed to poor discrimination
by subjects overburdened with ratings
enhanced the possibility that they were due
to the investigators’ failure to select stimulus
adjectives related to rugged-delicate.


TABLE 2

EVALUATIVE RATINGS FOR THREE HIGH-RELATED AND THREE LOW-RELATED PAIRS
OF ADJECTIVES SEPARATELY AND IN COMBINATION

                                                      M evaluation of
Adjectives (A-B)          Relationship     A       B     (1) (A + B)/2    (2) AB      (1) - (2)
Unlenient-unforgiving     High           -2.82   -4.85       -3.84         -5.56         1.72
Ungenerous-uncharitable   High           -2.97   -3.59    [illegible]   [illegible]  [illegible]
Passive-unenergetic       High            1.68    0.21        0.94      [illegible]  [illegible]
Unlenient-inconsistent    Low            -2.82    1.50    [illegible]   [illegible]  [illegible]
Ungenerous-unstable       Low            -3.18   -1.09       -2.43      [illegible]  [illegible]
Passive-uncalm            Low             2.45   -0.85        0.65         -0.24         0.89

Note. The mean evaluations in this table were computed on sums of ratings on three evaluative scales ranging from 3 to -3.
Therefore the potential range of the means is 9 to -9.


The results of the first two studies seemed
less than ideal in at least two respects: first,
neither study provided complete support for
the hypotheses with respect to positive adjectives;
second, different pairs of adjectives
differing naturally in their interrelationships
were used in both studies, leaving open the
possibility that some characteristic of the
adjective pairs other than their relationships
was responsible for the obtained results.
Therefore, a third study was conducted in an 
attempt to correct these deficiencies. First, 
positively evaluated adjectives were used as 
well as negatively evaluated ones; second, the 
relationships between the paired adjectives 
were experimentally manipulated, making it 
possible to use the same pairs of adjectives in 
both high-relation and low-relation conditions. 


Method 


Stimuli. Two pairs of adjectives were used: tall- 
fair and tall-unfair. The adjectives fair and unfair 
were used because of their strong evaluative impli- 
cations. “Tall” was used because it was assumed to 
be initially unrelated to fair or unfair and to the 
evaluative dimension.

Procedure. In an initial training period, the experimenter
read aloud, in a predetermined mixed
order, the heights (“tall” or “short”) of 28 individuals
said to have been randomly selected from a
“particular group on the University of Texas campus.”
After each height was read, the subjects indicated
whether they guessed that individual was fair
or unfair (“in dealing with others") by checking one 
of two boxes on their questionnaires. After each 
guess, the experimenter informed the subjects of 
the correct answer, and the subjects who had 
checked the wrong box crossed out their check marks 
and checked the alternate box. Subjects in the fair- 
implication condition were told that 9 of the 10 
tall persons were fair and 1 was unfair. Subjects in 
the unfair-implication condition were told that 9 
of the 10 tall persons were unfair and 1 was fair. 
All subjects were told that 9 of the 18 short indi- 
viduals were fair and 9 were unfair. After the in- 
formation about the 28 persons had been presented, 
subjects counted the numbers of tall and fair, tall 
and unfair, short and fair, and short and unfair 
persons indicated by their check marks. If any one 
of a subject’s four totals was in error by more than 
one unit, that subject was eliminated from the 
analysis.
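
The training frequencies and the elimination rule described above can be sketched as follows. The counts come from the Procedure; the function names are hypothetical, and the tally check is one reading of "in error by more than one unit."

```python
# Training frequencies stated in the Procedure (28 individuals per condition).
TRAINING_COUNTS = {
    "fair-implication": {
        ("tall", "fair"): 9, ("tall", "unfair"): 1,
        ("short", "fair"): 9, ("short", "unfair"): 9,
    },
    "unfair-implication": {
        ("tall", "fair"): 1, ("tall", "unfair"): 9,
        ("short", "fair"): 9, ("short", "unfair"): 9,
    },
}


def proportion_fair(counts, height):
    """Proportion of persons of the given height who were described as fair."""
    fair = counts[(height, "fair")]
    unfair = counts[(height, "unfair")]
    return fair / (fair + unfair)


def passes_tally_check(reported, correct):
    """False if any of the four totals is off by more than one unit."""
    return all(abs(reported[key] - correct[key]) <= 1 for key in correct)


for condition, counts in TRAINING_COUNTS.items():
    print(condition,
          proportion_fair(counts, "tall"),
          proportion_fair(counts, "short"))
```

In both conditions half of the short individuals were fair, so only the tall-fair relationship differed between conditions.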

In the immediately ensuing testing period, subjects 
were asked to rate two or three “additional mem- 
bers of the same group.” Half the subjects rated the 
adjectives tall-fair, while the other half rated tall- 
unfair. Half of each of these groups rated the two 
adjectives separately, while the other half rated 
them as attributed to one person. 

The ratings were made on 7-point bipolar semantic 
differential (Osgood et al., 1957) scales, scored 3 to 
—3. Both separate and combined adjectives were 
rated on the following scales: pleasant-unpleasant,
active-passive, kind-cruel, and nice-awful. In addition,
each subject rated a “tall” person on a scale
defined by fair-unfair, and a “fair” or “unfair” 
person—whichever adjective was a member of the 
pair being rated—on a scale defined by tall-short. 

In summary, there were eight subjects in each of 
32 conditions formed by the factorial combination 
of five dichotomous variables: implication (fair 
versus unfair), adjective pair (tall-fair versus tall- 
unfair), rating (separate versus combination), ad- 
jective order, and sex of subject. 

Subjects. Introductory psychology students vol- 
unteered for the study to fulfill a class require- 
ment and were run in groups which ranged in size 
from 24 to 46. Of the 288 subjects run, 14 were 
eliminated for incorrectly totaling the information 
given during the training period, and an additional 
18 were randomly discarded to equalize cell sizes, 
leaving the 256 subjects called for by the design. 
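
The cell count and subject attrition reported above can be checked with a short sketch. The factor names follow the summary of the design; the level labels for adjective order are placeholders, since the text does not name them.

```python
from itertools import product

# The five dichotomous variables named in the summary of the design.
# Level labels for adjective order are placeholders; the text does not name them.
FACTORS = {
    "implication": ("fair", "unfair"),
    "adjective_pair": ("tall-fair", "tall-unfair"),
    "rating": ("separate", "combination"),
    "adjective_order": ("order 1", "order 2"),
    "sex": ("male", "female"),
}
SUBJECTS_PER_CELL = 8

cells = list(product(*FACTORS.values()))
required = len(cells) * SUBJECTS_PER_CELL

# Attrition reported in the Subjects paragraph.
retained = 288 - 14 - 18

print(len(cells), required, retained)  # 32 256 256
```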


Results 


To find out if subjects perceived the in- 
tended relationships between height and fair- 
ness, an analysis of variance was performed 
on the ratings of a “tall” person on the fair- 
unfair scale. As intended, the fair-implication 
subjects rated a tall person as more fair than 
did the unfair-implication subjects (F = 387,
df = 1/252, p < .01).

As in Studies 1 and 2, the problem of lo- 
cating the effective scale midpoint arose. 
That is, as shown in the fifth column, Table 
3, the mean of the mean separate ratings of 
tall-unfair by the fair-implication subjects was
unexpectedly positive (1.00). This problem 
was overcome by assuming that the subjec- 
tive scale midpoint was 3.28—the mean rat- 
ing by Study 2 subjects of an “average” per- 
son on scales identical to the present ones. 
However, adopting this midpoint made the 
mean of the mean separate ratings of tall- 
fair by the unfair-implication subjects (1.61) 
unexpectedly negative. Even so, it seemed
reasonable to expect the hypotheses for posi- 
tive mean separate ratings to apply in this 
special situation, where combining tall and 
fair would be expected to eliminate the nega- 
tive connotations of “tall” which depended 
on the tall-unfair implication, but to produce 
no compensatory negative change in the con- 
notation of “fair.” 

An analysis of variance was performed on
the combination and mean separate evaluative
ratings. As hypothesized, the combination
ratings tended in all conditions to be more
extreme than the means of the separate ratings
(if 1.61 is considered positive and 1.00,
negative); this resulted in a significant
(F = 30, df = 1/240, p < .01) rating by adjective
pair interaction.


TABLE 3

EVALUATIVE RATINGS FOR TALL-FAIR AND TALL-UNFAIR, SEPARATELY AND IN COMBINATION,
AS A FUNCTION OF THE TRAINED RELATIONSHIP BETWEEN THESE ADJECTIVES

                                                M evaluation of
Adjectives (A-B)   Trained implication     A       B     (1) (A + B)/2    (2) AB    (1) - (2)
Tall-fair          Fair                   5.66    5.28        5.47          5.84      -0.37
Tall-fair          Unfair                -2.09    5.31        1.61          3.69      -2.08
Tall-unfair        Fair                   4.72   -2.72        1.00         -2.72       3.72
Tall-unfair        Unfair                -1.78   -3.69       -2.73         -4.28       1.55

Note. The mean evaluations in this table were computed on sums of ratings on three evaluative scales ranging from 3 to -3.
Therefore the potential range of the means is 9 to -9.


The implication by rating interaction was 
significant (F = 7.6, df = 1/240, p < .01) 
and, as can be seen in the last column of Ta- 
ble 3, in the predicted direction. That is, for 
subjects rating tall-fair, the separate-combina- 
tion difference was —2.08 in the unfair-impli- 
cation condition (where a negative tall-fair 
relationship existed), but only —.37 in the 
fair-implication condition (where a positive 
tall-fair relationship existed). For subjects 
rating tall-unfair, the separate-combination 
difference was 3.72 in the fair-implication 
condition (where a negative tall-unfair rela- 
tionship existed), but only 1.55 in the unfair- 
implication condition (where a positive tall- 
unfair relationship existed). This interaction 
was not of significantly different magnitude 
in the tall-fair and tall-unfair rating conditions,
as indicated by the absence of a significant
third-order interaction. The active-passive
rating data did not support either hypothesis.
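
The separate-combination differences cited above can be recomputed from the means printed in Table 3. The sketch below does this under the assumption that the difference is the mean of the two separate ratings minus the combination rating, which is how the last column of the table reads; the function name is hypothetical. Because the inputs are already rounded to two decimals, a recomputed value can differ from the printed one by about .01.

```python
# Mean evaluations from Table 3: (adjective pair, trained implication) -> (A, B, AB).
TABLE_3 = {
    ("tall-fair", "fair"): (5.66, 5.28, 5.84),
    ("tall-fair", "unfair"): (-2.09, 5.31, 3.69),
    ("tall-unfair", "fair"): (4.72, -2.72, -2.72),
    ("tall-unfair", "unfair"): (-1.78, -3.69, -4.28),
}


def separate_combination_difference(a, b, ab):
    """Mean of the separate ratings minus the combination rating."""
    return (a + b) / 2 - ab


differences = {
    key: separate_combination_difference(*values)
    for key, values in TABLE_3.items()
}

for (pair, implication), diff in differences.items():
    print(f"{pair:12s} {implication:7s} {diff:6.2f}")
```

For each adjective pair, the difference is larger in absolute size in the condition where the trained relationship between the two adjectives was negative, which is the pattern the implication by rating interaction describes.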

The implication main effect was also significant
(p < .01) for the evaluative rating
data. Separate analyses indicated that this
effect was due to significant (p < .01) differences
between fair- and unfair-implication
conditions in the mean separate ratings as
well as in the combination ratings. As in
Study 2, the fact that mean separate ratings
were more extreme in the high-relation than
in the low-relation conditions permitted the
alternative hypothesis that a ceiling effect, not
redundancy, caused the obtained implication
by rating interaction. Arguing against this
possibility is the finding that no ceiling effect
was evident within high- and low-relation
conditions in Study 2, and the unlikelihood
that the mean combination rating (-4.28)
of tall-unfair by the unfair-implication subjects
in the present study was limited by a
ceiling, since it was less than halfway from
scale zero to -9, the lower limit of the scale.

Finally, as would be expected, the evalua- 
tive ratings of tall-fair tended to be more 
positive (p < .01) than those of tall-unfair.


DISCUSSION 


The impression formation data reported 
here are more easily explained by the present 
redundancy theory than by the somewhat 
simpler mean and summation theories. 

First, all three of the studies reported here 
produced support for the prediction that the 
evaluation of two adjectives in combination 
(V_AB) would be more extreme than the mean
evaluation of those adjectives separately
[(V_A + V_B)/2] for the case of negative adjectives,
and the first and third studies supported
the same hypothesis for the case of
positive adjectives. These results directly
contradict any mean theory which implies 
that the evaluation of an impression is the 
average of the evaluations of the stimulus 
adjectives going into that impression. A sum- 
mation theory, on the other hand, would pre- 
dict such results (Fishbein & Hunter, 1964; 
Triandis & Fishbein, 1963). 

Second, all three studies supported the hypothesis
that the relationship existing between
two adjectives going into an impression
is related to the difference between the
value of the impression and the mean value of 
the separate adjectives. All three studies pro- 
duced support for the case of negative ad- 
jectives, and the third study also produced 
support for the case of positive adjectives. 
Neither the mean nor the summation theory 
can handle this finding, since neither takes 
redundancy into account. 

These redundancy effects were largely ob- 
tained by comparing adjective pairs which 
differed in redundancy to an unusual extent. 
While these data do not permit sound con- 
clusions about the importance of more usual 
differences in redundancy, a study by An- 
derson (1962) seems to indicate that such 
differences may be of rather small importance. 
Anderson found that when subjects rated the 
likability of persons described by factorial 
sets of three adjectives, the pooled interac- 
tions (which would include any effects of 
differences in intraset redundancy) were sig- 
nificant for only 3 of the 12 subjects. 

Finally, the present model can lead to the 
prediction that the value of a combination 
will tend to be more extreme than the sum of 
the value of the two separate adjectives when 
the relationship between those adjectives is 
negative. The most relevant available data 
come from the third study, the only one 
where substantial negative relationships were 
present. And, in fact, the data for one of the 
two relevant conditions in Study 3 do seem 
to provide suggestive support for this prediction.
The third row of Table 3 shows that in
the tall-unfair rating, fair-implication condi- 
tion, the rating of tall alone was 4.72 units 
above zero and the rating of unfair was 2.72
units below it. A simple mean or summation
theory would predict that the combination 
would be rated somewhere between the values 
of these separate adjectives, whereas in fact 
the combination was rated at exactly the same 
value as was “unfair” (—2.72). It is of course 
possible that the subjective midpoint of the 
scale was at or above the “tall” rating of 4.72, 
thus invalidating the present argument; how- 
ever, this seems unlikely since the subjects in 
Study 2 rated an "average" person at only 
3.28 on an evaluative scale identical to the 
one used in Study 3. 
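
The argument in the preceding paragraph can be made concrete with a short sketch using the third row of Table 3. It takes "simple mean" to be the average of the two separate ratings and "simple summation" to be their sum, measured from scale zero; both readings are assumptions made for illustration.

```python
# Tall-unfair pair, fair-implication condition (third row of Table 3).
tall, unfair, combination = 4.72, -2.72, -2.72

mean_prediction = (tall + unfair) / 2   # simple mean theory
sum_prediction = tall + unfair          # simple summation theory

low, high = min(tall, unfair), max(tall, unfair)
for name, predicted in (("mean", mean_prediction), ("sum", sum_prediction)):
    inside = low < predicted < high
    print(f"{name}: predicted {predicted:.2f}, "
          f"strictly between separate ratings: {inside}, "
          f"observed {combination:.2f}")
```

Both predictions fall between the two separate ratings, whereas the observed combination rating sits at the lower of the two and is more negative than either prediction.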
The paradigm known as decision under conditions of risk was employed to
investigate the cognitive effects of a negative outcome occurring after a decision 
which alters the utility of the alternative chosen. 3 independent variables were 
manipulated: decision vs. no decision, probability of outcome at the time of 
decision, and obtained outcome. It was predicted that a negative outcome of 
the decision would result in dissonance reduction or attempts to justify the 
prior decision only when S perceived a definite possibility of this outcome at 
the time of decision. This hypothesis was confirmed in Ss' distortion of the un- 
pleasantness of a preparation undertaken as a consequence of their decisions 
and in their ratings of fear of an event for which they had prepared. A 
composite measure of dissonance reduction also showed the predicted relationship.
No support for the hypothesis was obtained in Ss' evaluation of the
experiment.


An important class of decisions are those 
which must be made on the basis of a sub- 
jective estimate of the probability of the out- 
come. Consequently, at the time of decision, 
the individual is uncertain of the ultimate 
utility of his choice since its utility is de- 
pendent upon the outcome determined at some 
later time. This paradigm, decision under con- 
ditions of risk, has received much attention 
in decision theory, and various strategies have 
been formulated for making the optimal de- 
cision (see, e.g., Luce & Raiffa, 1957), but the 
psychological effects of such decisions have 
been relatively neglected. 

Consider the case where a person must decide
whether or not he will prepare for a
possible future event which may or may not
occur. Obviously, if the event does occur, one
would expect that the individual who had
chosen to prepare would feel elation and congratulate
himself for having made a wise
choice. If the event for which he had prepared
does not occur, thus rendering his
preparation useless, he should experience regret
since, under this outcome, the decision
not to prepare would have greater value.
Festinger’s (1957, 1964) theory of cognitive
dissonance would imply that, at least under
certain circumstances, the person would attempt
to reduce his regret, or dissonance, and
justify his prior decision.

1 This study was supported by funds from the Institute
of Social Sciences, University of California,

It is a patent truism, however, that people 
do not always defend their decisions when 
faced with a negative outcome but that, in- 
stead, they often express open regret. Watts 
(1965) found that subjects who prepared of 
their own volition for a painful event which 
they thought was certain to occur found the 
preparation more unpleasant, evaluated the 
experiment more negatively, and were less 
willing to recommend participation to others 
when the event unexpectedly failed to occur 
than their counterparts who were given no 
choice in the matter. These results are opposite
to the usual relationship obtained between
choice and dissonance reduction. (See
Brehm & Cohen, 1962, for a discussion of this
topic.)

It is suggested that the perceived proba- 
bility of occurrence of the event at the time 
of decision is an important variable mediating 
the cognitive effects of the decision by de- 
termining whether the individual feels a need 
to defend his prior act. This reasoning would 
imply that if an individual makes a logical 
decision to prepare for an event that is pre- 
sumably certain to occur, as in the Watts 
study, but which, due to a capricious uni- 
verse, fails to occur, dissonance reduction 
would not ensue even though the require- 
ments would appear to be formally met. 
Rather, more likely, frustration and anger
would result. Indeed, Freedman (1963) demonstrated
that if justification for working on
a tedious task was abrogated after the task
(thus constituting a fait accompli), subjects 
rated the task as less enjoyable than their 
counterparts who were given adequate justi- 
fication, whereas inadequate justification prior 
to beginning the task increased liking for it. 

Alternatively, if an individual decides to 
prepare for an event whose occurrence is 
clearly problematical, one would expect him 
to feel self-reproach for having acted rashly 
and to make strong attempts to justify the 
previous act when the event did not occur. 
This prediction is consistent with both the 
notion of inadequate justification (e.g., Fest- 
inger & Carlsmith, 1959; Freedman, 1963) 
and self-esteem theory (Deutsch, 1961; 
Deutsch, Krauss, & Rosenau, 1962; Deutsch 
& Solomon, 1959). It should be clear that, at 
least under the present conditions, the two 
are not independent; that is, a person who 
chooses to prepare with low justification (i.e., 
low probability of the event occurring) might 
be expected to feel his self-esteem threatened 
in terms of whether or not he had made an 
“intelligent” decision under the circumstances. 

In order to test this hypothesis, three in- 
dependent variables were manipulated in the 
context of a subject undergoing a very un- 
pleasant preparation for a possible painful 
experience. These were: choice versus no 
choice in the matter of preparation, high 
versus low probability of the event occurring, 
and whether or not the event occurred. An in- 
teraction effect was predicted so that subjects 
having high choice in preparation would mani- 
fest greater dissonance reduction than their 
low-choice counterparts when the preparation 
proved to be in vain only when the probability 
of the event occurring was low at the time 
of preparation. 

A number of possible modes of dissonance 
reduction available to our subjects were meas- 
ured as dependent variables. These included 
minimizing the unpleasantness of the prepara- 
tion, increasing anticipation of pain and fear, 
enhancing the scientific value of the experi- 
ment, and willingness to recruit other subjects. 

Distorting in memory the unpleasantness 
of the preparation would provide a direct 
means of reducing the dissonance experienced
by removing the major source of dissonance.
Such distortions of the physical environment 
have previously been established as modes of 
dissonance reduction (Brock & Buss, 1962). 

Another way that a subject could justify 
his decision to undergo the unpleasant prepa- 
ration would be through exaggeration of his 
fear of the painful event. Convincing himself 
that it would have been dreadful and that he 
is particularly sensitive to pain supplies cog- 
nitions consonant with the preparation taken 
out of fear of the possible painful experience. 
While Festinger (1957) cites as anecdotal 
evidence of this phenomenon the studies of 
Murray (1933), Prasad (1950), and Sinha 
(1952), it has not, to the writer’s knowledge, 
been experimentally demonstrated. 

A more tangential means by which a sub- 
ject might reduce his dissonance would be 
that of convincing himself that he had suf- 
fered for a good cause as the experiment 
would make a real contribution to science. 
There is ample evidence that such changes in 
evaluation can serve as modes of dissonance 
reduction (e.g., Brehm & Cohen, 1959; Freed- 
man, 1963; Raven & Fishbein, 1961); but it 
is equally clear that, in the present situation, 
it would not as effectively justify the prepara- 
tion per se as the two methods previously 
discussed. 

Similarly, recruiting other subjects to take 
part in the experiment might provide a pos- 
sible means of dissonance reduction since 
persuading others that the experiment is valu- 
able and worth participating in would also 
supply cognitions consonant with the sub- 
ject’s plight. 

The subjects in this experiment can pre- 
sumably reduce their dissonance, or justify 
their actions, by any combination of these 
available modes; but it is not to be expected 
that each will receive equal usage or, indeed, 
any usage at all (Steiner & Johnson, 1964; 
Steiner & Rogers, 1963). Rather, it is to be 
expected that there will prove to be preferred 
modes of dissonance reduction (Brock, 1962; 
Brock & Buss, 1962; Pilisuk, 1962). How- 
ever, in our present state of knowledge there 
is little a priori basis for predicting any particular
preference among the alternatives in
the present study.



COMMITMENT AND RISK 


METHOD 
General Procedure 


The experiment was represented to the subjects as 
a study of psychological factors related to taste. 
Upon entering the experimental room they were told, 
in a standardized introduction, that we were examin- 
ing this relationship by means of two procedures— 
actually tasting various liquids and through electro- 
physiological methods. 

It was explained that while the ideal arrangement 
would be to have each subject serve in both methods, 
this was impossible since serving in one tended to 
contaminate the person’s responses on the other. 
Hence, we were following the next best strategy of 
matching subjects in the two conditions on numerous 
characteristics so that they would be as much alike 
as possible. The instructions continued as follows: 


During the first part of the hour you will fill 
out a personal data sheet containing background 
information and a personality inventory. On the 
basis of your responses to these items you will be 
assigned to one technique or the other; that is, 
either to the electrophysiological method or tast- 
ing the various substances. At this moment I have 
no idea which of the two conditions you will take 
part in, but I can assure you that each is harmless. 

If you are assigned to the electrophysiological 
method I will stimulate various areas of your 
tongue with electrodes [the experimenter points to 
them] and record the resulting taste sensations. 
This instrument [the experimenter directs the sub- 
ject’s attention toward an electrical instrument] 
produces a low voltage which could not possibly 
harm anyone even if they had a serious heart con- 
dition. Unfortunately, however, the tongue is such 
a sensitive organ that even though the electric 
current is harmless, the stimulation is quite pain- 
ful. Therefore, we provide a mild type of local 
anesthetic which is simply rinsed around the tongue 
and somewhat desensitizes it. It doesn’t completely 
eliminate the pain, but it certainly does help to 
reduce it. 


At this point, the subject was asked to fill out a 
personal data sheet which included the premanipula- 
tion measures of the dependent variables. After the 
subject completed the “personal data” sheet, the ex- 
perimenter carefully scrutinized it and informed the 
subject that, on the basis of this preliminary in- 
formation, the chances would appear to be about 
50-50 (in the low-probability condition), or 95 out 
of 100 (in the high-probability condition) that he 
would be assigned to the electrophysiological method.
The experimenter continued: 


Of course, you understand that it will ultimately 
depend not only upon the information here, but 
upon your responses to the longer inventory that 
you will be asked to fill out next. 

As I said previously, the electrical stimulation is 
quite painful and because of this we provide an
oral anesthetic. Like most anesthetics it takes
some time to take effect—for this particular one, 
about fifteen minutes. Since, unfortunately, so 
much time is required to complete this important 
information and since it was necessary to schedule 
subjects at hourly intervals due to the brief 
summer session, if one is to take the anesthetic and 
benefit from it, it is necessary to take it now so 
that it can be taking effect while you are filling 
out the rest of the material. I’m sorry that I can- 
not say with certainty whether, in your case, the 
anesthetic will be necessary or not; but, at this 
point, I simply do not know. 


The experimenter explained that the anesthetic is 
taken orally in two small doses spaced about 1 
minute apart. He stressed the importance of rinsing 
it thoroughly around the tongue and holding it in 
the mouth for some time before swallowing since it is 
this local contact which has the effect. Subjects were 
told that the anesthetic itself had a very unpleasant 
taste, and were given a very bitter and extremely 
unpleasant tasting solution of iron and quinine. 

Choice in taking this preparatory "anesthetic" was 
manipulated by the following instructions in the 
high-choice condition: 


We provide the anesthetic because the electrical 
stimulation is quite painful, but it is entirely up to 
the individual whether or not he takes it. You 
are completely free either to take the anesthetic 
or to go directly ahead with the experiment— 
whichever you choose. Would you prefer the anes- 
thetic or not? 


All subjects assigned to this choice condition chose 
to take the anesthetic. 

Subjects in the low-choice condition were given 
identical instructions up to the point of choice. Here 
they were simply given the anesthetic without any 
mention of their having a choice in the matter. Sub- 
jects were assigned randomly, within each probability 
condition, to each of two choice conditions. 

Having taken the preparatory "anesthetic," the 
subject was given the longer “personality” inventory 
which presumably was to determine his ultimate as- 
signment to one method or the other of studying 
taste. When the subject had completed this inventory, 
the experimenter feigned scoring it and carefully 
studied the subject’s “profile,” frequently referring 
to other “data sheets.” After due consideration, the
experimenter informed the subject that he was as- 
signed to the electrophysiological method (consonant 
condition) or taste method (dissonant condition), as 
the case might be, since he happened to perfectly 
match a subject who had participated in the other 
condition. Subjects within each of the probability 
and choice conditions were randomly assigned to the 
two outcomes. 

The experimenter then remarked that before pro- 
ceeding to the experimental treatment, he would like 
the subject to fill out one more brief form to provide 
some reliability data on the previous measures. The
subject then filled out the second form of the questionnaire
which was markedly different in format.
When this was completed, the subject was ques- 
tioned concerning possible suspicions about the ex- 
periment, told the true purpose of the experiment, 
and asked not to discuss it with anyone. The evidence 
indicates that the subjects complied with this request. 


Subjects and Design 


All subjects participated in the experiment to ful- 
fill a requirement for educational psychology courses 
at the University of California, Berkeley. The total 
of 56 subjects was drawn from the summer session 
and was of an extremely heterogeneous composition 
ranging from 18 to 50 years of age with a mean age 
of 25 years. Sixty-eight percent were females. 

Four subjects refused to go on with the experi- 
ment after learning of its painful aspects. Since this 
subject attrition occurred prior to the experimental 
manipulations, it did not introduce a source of 
ambiguity in interpreting the results. 

The two levels of choice in taking the anesthetic 
(high and low), two levels of probability of the event 
occurring (.50 and .95), and the two outcomes (dis- 
sonant and consonant) constituted a three-factor de- 
sign of two levels each. Pre- and postmanipulation 
measures were obtained on all but one of the de- 
pendent variables so that each subject served as his 
own control, thus achieving greater statistical sensi- 
tivity. 

Subjects were randomly assigned to the eight ex- 
perimental conditions with the proportion of males 
to females approximately equal in the different cells 
and “run” individually. The amount of time required 
for each subject to go through the experiment was 
approximately 30 minutes. 


Materials 


Two questionnaires were used to measure the de- 
pendent variables and to provide a check on the 
manipulations of the independent variables. The 
“before” questionnaire consisted of a series of ques- 
tions designed to measure the subject’s anticipation 
of pain and fear of the electrical stimulation, his be- 
lief in the value of psychological research, and his 
willingness to recruit other subjects to take part in 
the experiment. The “after” questionnaire contained
the same questions with rewording to past tense 
when appropriate. In addition, questions were asked 
concerning the perceived unpleasantness of taste of 
the “anesthetic” and the effectiveness of the manipu- 
lations of the independent variables. (All questions 
are included in the Results section according to the 
variable they were designed to measure.) Subjects 
responded to each of these questions on the basis of 
a 15-point scale labeled each third point. In all cases, 
1 indicated a low score on the item and 15 a high 
score. The scores on each of the questions relating 
to a particular dependent variable were summed to 
provide an overall score for that variable. For all 
variables measured both prior to and following the 
experimental manipulations, the statistical analyses 
were performed on the difference scores between
these two measures. 
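
The scoring rule just described (sum the items belonging to a variable, then subtract the "before" composite from the "after" composite) can be sketched in Python. This is only an illustration under stated assumptions: the function names and the example ratings are hypothetical and are not taken from the study.

```python
def composite_score(item_ratings):
    """Sum the ratings (15-point scale) of the items belonging to one variable."""
    for rating in item_ratings:
        if not 1 <= rating <= 15:
            raise ValueError(f"rating {rating} is outside the 15-point scale")
    return sum(item_ratings)


def change_score(before_ratings, after_ratings):
    """Difference score: "after" composite minus "before" composite."""
    return composite_score(after_ratings) - composite_score(before_ratings)


# Hypothetical subject: two fear items rated before and after the manipulation.
fear_change = change_score(before_ratings=[9, 10], after_ratings=[11, 11])
print(fear_change)  # prints 3
```

A positive value indicates an increase from the first to the second questionnaire on that variable.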

The purported anesthetic was, in fact, a tonic containing
60 minims of tincture citro-chloride of iron,
8-60 grain of strychnine sulphate, and .25 gram of 
quinine hydrochloride per fluid ounce. 


RESULTS AND DISCUSSION
Effectiveness of the Manipulations 


The manipulation of choice was evaluated 
by a postexperimental question which asked, 
“How much choice did you have in taking
the anesthetic?” The overall means on a 15-point
scale, with 1 being “no choice,” were
10.57 for subjects in the high-choice condi- 
tion, and 6.61 for subjects in the low-choice 
condition. Analysis of variance indicates that 
this difference in means is highly significant 
(F = 10.58, p < .01),² while all other comparisons
yield F ratios of less than 1.

Subjects also responded to the question,
“At the time the anesthetic was offered to you, 
how likely did you think it was that you 
would be assigned to the electrophysiological 
method [1, “very unlikely,” to 15, “very
likely”]?” While this labeling was necessary
to give the impression that there were all degrees
of likelihood, it did serve to restrict the
range of subjects’ responses since, in fact, the
low likelihood was .50, which fell at the midpoint
of the scale. In spite of this conservatism,
the main effect of probability of occurrence
was significant beyond the .01 level
(F = 7.69) with subjects perceiving greater 
or lesser likelihood of receiving the painful 
treatment in their respective conditions. The 
overall means obtained were 11.83 for the .95 
probability condition and 9.14 for the .50
probability condition. No other effect in the
table approached significance. Hence, it can
be concluded that both manipulations were
quite satisfactory.

It should also be mentioned at this point 
that such a small percentage of subjects 
(25%) showed any change from the pre-to-post
measures of the dependent variable of
willingness to recruit others that these change
scores were deemed too unreliable to warrant
statistical analysis for differential effects of
the independent variables. 


² All estimates of significance levels reported in
this study are based on two-tailed tests. 
Effects of the Independent Variables


Effects on perceived unpleasantness of the 
preparation. One obvious way that a subject 
could reduce the dissonance resulting from 
having needlessly undergone an unpleasant 
preparation would be through minimizing the 
unpleasantness of the preparation, thereby 
removing the major source of dissonance. In 
the postexperimental questionnaire, subjects 
were asked to rate how unpleasant they found
the “anesthetic” on a 15-point scale ranging
from “not at all unpleasant” at the low end
to “very unpleasant” at the high end of the
scale. The mean ratings for each of the 
eight conditions are presented in the cells of 
Table 1. 

The second-order interaction effect involv- 
ing choice, probability of outcome, and ob- 
tained outcome is significant (F = 4.06, p < 
.05) and in the predicted direction. That is, 
in the dissonant conditions, choice has a dif- 
ferential effect upon perceived unpleasantness 
of the preparation only in the .50 probability 
condition with those subjects having high 
choice stating that the preparation was less
unpleasant than their counterparts having no
choice; whereas, in the .95 probability condition,
the mean ratings of unpleasantness for
the two choice groups are virtually identical. 
In the consonant conditions one might also 
expect subjects who took the preparation of
their own volition to find it less unpleasant
because of its utility (mitigating severe pain)
and to have a tendency to congratulate themselves
for having made a wise decision. It can
be seen in the consonant cells of Table 1 that


TABLE 1

SUBJECTS’ MEAN RATINGS OF UNPLEASANTNESS OF
TASTE OF THE PREPARATORY “ANESTHETIC”
(HIGH RATINGS INDICATE HIGH PERCEIVED
UNPLEASANTNESS)

                         Condition
                 Dissonant      Nondissonant
                 .50    .95     .50    .95
High choice     6.86   8.28    7.86   6.28
Low choice     10.28   8.43    6.57   9.86

Note. Each cell N in this table and following tables is 7 unless
otherwise specified. A different set of 7 subjects furnished
the scores for each cell.
this is indeed the case in the .95 probability
condition, but that the choice effect com- 
pletely washes out in the .50 probability con- 
dition—an exact reversal of the dissonant 
cells. This might be due to the fact that sub- 
jects in the .50 probability condition had 
reason to hope until the last minute that they 
would be spared the painful treatment. Hence, 
their reaction to learning that they had defi- 
nitely been assigned to it may have over- 
shadowed any effect of choice. 

While the consonant conditions were neces- 
sary to establish whether any obtained ef- 
fects were in fact due to the hypothesized 
relationship, rather than some function of 
making a choice under uncertainty and con- 
flict, it is also instructive to examine the 
dissonant conditions separately. It is apparent 
that the relationship obtained in these four 
cells is in the direction predicted by the 
theory, but this interaction falls short of 
statistical significance. However, the predicted 
difference between the high-choice and low- 
choice means in the .50 probability condition 
is sizable (6.86 versus 10.28) and approaches 
significance (t = 1.70, .05 < p < .10).³

Effects on subjects’ ratings of fear and an- 
ticipated pain. One of the most interesting 
modes of dissonance reduction among those 
measured in the present experiment was that 
of exaggerating the perils of the electrical 
stimulation. Assuming the subject was in a 
dissonant state having rashly chosen the vile 
anesthetic in the face of low objective proba- 
bility of pain, he might reduce this dissonance 
in one way by convincing himself that he had 
a low tolerance of pain and was very fearful 
about having his tongue electrically stimu- 
lated, thus supplying cognitions consonant 
with his preparation. No such effect would 
be expected in the .95 probability condition 
since, in the face of such impressive odds of 
being subjected to the painful treatment, 
further justification for the decision would 
hardly be necessary. For those subjects in the 
consonant conditions who were assigned to the 
treatment for which they had prepared, differential
changes in fear and anticipation of
pain would not be expected.

³ The significance of this difference and all other
comparisons involving individual cell means were
evaluated on the basis of the total within-cell variance
because of the relatively small number of observations
in each cell (Winer, 1962, pp. 207-210).

Fear of the electric shock was measured by 
two items: “How worried or fearful are you 
about receiving the electrical stimulation?” 
and “How dangerous do you think the use of 
electric shock is?” Three items pertained to 
the subject’s anticipation of pain: “How painful
do you expect the electrical stimulation to
be?” “How painful have you found electric
shock to be in the past (i.e., accidental shock
through faulty wiring, etc.)?” and “How
sensitive are you to painful experiences?”

While this division between fear and pain 
might seem arbitrary, it was made on a priori 
grounds. A previous study (Watts, 1965) 
using the same items found that, perhaps 
contrary to intuition, fear and anticipation of 
pain did not vary concomitantly; but that, 
instead, ratings of fear increased markedly 
during the course of the experiment, while 
anticipation of pain decreased (p < .01). In
the present study, as well, subjects’ ratings of 
pain and fear diverged during the course of 
the experiment with ratings of fear increasing 
while anticipation of pain showed a mean decrease
from the pre-to-post measures (t =
2.43, p = .02). Considering first the subjects’
ratings of fear, the initial means of this com- 
posite of two items and the mean changes 
from this initial position in each of the ex- 
perimental conditions are presented in Table 
2. Since the premanipulation means did not 
differ significantly from one another (F = .30, 
df = 7/48), the analyses were performed on 
the change scores. 

In the dissonant conditions, it is clear that 
the only cell showing any appreciable increase 
in ratings of fear is the high-choice, .50 probability
condition, as predicted. The choice effect
in the .50 probability condition is significant
beyond the .05 level (t = 2.38), with
those subjects having a high degree of choice
in taking the unpleasant preparation claiming
that they are more afraid than their low-choice
counterparts. In the .95 probability
condition the trivial difference (t = .84) between
high- and low-choice means is in the
opposite direction. This interaction between
probability of outcome and choice is significant
at the .05 level (F = 5.18, df = 1/24),
and strongly supports the hypothesis since
TABLE 2

INITIAL MEANS AND MEAN CHANGES FROM THAT
INITIAL LEVEL IN SUBJECTS’ RATINGS OF
FEAR OF THE ELECTRICAL STIMULATION

                                  Condition
Choice in taking            Dissonant       Nondissonant
anesthetic                 .50     .95      .50     .95
High choice
  Premanipulation M        9.71   12.00    11.28   10.57
  M change                +2.14    −.29    +2.00    +.43
Low choice
  Premanipulation M       10.71    9.43     9.71    8.43
  M change                 −.29    +.57    +1.00   +1.00


there is no “rational” basis of fear for any 
of those subjects assigned to the innocuous 
taste method. 

In contrast, all of the consonant conditions 
show a realistic increase in level of fear after 
having been assigned to the painful electro- 
physiological treatment. The overall mean in- 
crease of 1.11 points is significantly different 
from 0 at the .02 level (t = 2.56) and the
mean changes recorded in each cell are quite 
homogeneous (F = .53 for treatments, df = 
3/24). It is evident, however, that even in 
the consonant conditions those subjects who 
took the preparation of their own volition 
when the probability was low claim they are 
more afraid, but this difference is much less 
pronounced in relation to the other three 
cells as indicated by the trivial F value for 
treatments. The somewhat higher level of 
fear in that cell can probably be attributed to 
the fact that, after making the decision and 
taking the “anesthetic,” the subject had approximately
15 minutes to question the wisdom
of his choice before learning the outcome.
It is likely he partially convinced himself
that the treatment was indeed something
to fear in order to justify his decision prior to 
learning the outcome and that once assigned 
to the painful treatment there was little rea- 
son for him to alter his belief. 

Analysis of variance of the consonant and
dissonant conditions combined indicates that
the Choice × Probability of outcome interaction
effect is significant (F = 4.48, p < .05),
but this analysis provides no new information.

No support was obtained for the prediction
that subjects would increase their anticipation
of the painfulness of the treatment in order to
justify their preparation. Both the main effects
and the interactions yielded F ratios of less
than 1.

Effects on subjects’ evaluation of the experiment.
In addition to those modes of dissonance
reduction previously discussed, a
subject might attempt to regain cognitive balance
by enhancing the scientific value of the
experiment in order to supply cognitions consonant
with his predicament. Thus, a subject
might convince himself that, in spite of his
unhappy plight, his suffering had, after all,
been for a good cause since research of this
type added substantially to scientific knowledge.
Evaluation of the experiment was measured
by four items: “How important a contribution
do you feel psychological research
makes to society?” “How valuable do you
think this particular study is?” “How rewarding
do you find it to take part in this
study?” and “How interesting do you find
this study?” The scores presented in the cells
of Table 3 are the mean differences of this
composite between the before and after measures.

TABLE 3

INITIAL MEANS AND MEAN CHANGES FROM THAT
INITIAL LEVEL IN SUBJECTS’ EVALUATION
OF THE EXPERIMENT

                                  Condition
Choice in taking            Dissonant       Nondissonant
anesthetic                 .50     .95      .50     .95
High choice
  Premanipulation M       39.86   33.43    29.71   32.00
  Change                   +.71   +3.14    −2.71   −1.86
Low choice
  Premanipulation M       33.28   27.43    30.86   32.00
  Change                  −2.43   −2.71     +.57   −1.28

The only significant effect in the table
is the Choice × Outcome interaction (F =
10.35, p < .01) which is in the direction predicted
from dissonance theory. That is, subjects
taking the vile preparation of their own
volition tended to evaluate the experiment in
a positive direction after learning that the
preparation was to no avail. The predicted
second-order interaction involving choice,
probability of outcome, and obtained outcome
is nil (F ~ 0.00). In fact, it can be seen in
Table 3 that the only cell showing a sizable
increase in evaluation of the experiment is the
high-choice, .95 probability cell in the dissonant
condition; and this increase of 3.14
points is considerably greater than that obtained
in its .50 probability counterpart. This
represents a complete reversal of trend from
that obtained in the modes of dissonance reduction
previously discussed, and is clearly inconsistent
with theoretical expectations.

Since, unlike the other modes of dissonance
reduction that were measured, these items pertained
to satisfaction derived from the experiment,
as well as its scientific value, the obtained
results may be due, at least in part,
to a surge of relief at the lucky escape from
the painful treatment in the .95 probability
condition which would not be experienced by
those subjects in the .50 probability condition
who believed all along that they were
equally likely not to receive the electrical
stimulation. Those subjects in the low-choice,
.95 probability condition who were “forced”
to take the vile preparation only to learn
that it was now useless may well have been
left, both figuratively as well as literally, with
a “bitter taste in their mouths” concerning the
experiment.

It is also possible that a ceiling effect is
partially responsible for the smaller change
in the .50 probability condition since this
cell’s premanipulation mean happened to be
considerably higher than any other cell in the
table and over 6 points higher than the
.95 probability condition which showed the
greater increase. Such an interpretation is not
compelling, however, considering the possible
range (4-60) and the fact that the premanipulation
means did not differ significantly
from one another (F = 1.07, df = 7/48).

Effects on a composite measure of dissonance
reduction. When a number of possible
modes of dissonance reduction are presented
to a subject, the question arises as to whether
he will utilize some one preferred mode at the
expense of others, or whether, instead, he will
utilize combinations of modes, or even all
existing modes to some extent. Theoretically,
if the subject manages successfully to reduce
his dissonance through one of the various
possibilities, there would be little or no need
for him to make use of the others. On the 
other hand, to the extent that one mode is 
being used, there is indication that the per- 
son is in a dissonant state and, therefore, 
might need other modes of reduction also. 

The data in the present study, like that 
of Steiner and his co-workers (Steiner & John- 
son, 1964; Steiner & Peters, 1958; Steiner & 
Rogers, 1963) tend to support the position 
stated first. Low nonsignificant correlations 
(ranging from +.13 to —.24) were obtained 
among the different modes of dissonance re- 
duction indicating that they were utilized by 
the subjects relatively independent of one 
another. When different subjects emphasize 
different means of dissonance reduction, it is 
not surprising if any one method, considered 
alone, fails to show a significant relationship 
to the independent variables. Consequently, 
some measure of a subject’s usage of all avail- 
able modes of dissonance reduction would 
provide a more reliable index with which to 
test the hypothesis. In the present study, a 
subject could effectively justify his decision
to take the vile preparation by one or any 
combination of the following modes: dis- 
tortion of unpleasantness of the preparation, 
exaggeration of fear of the treatment, and 
convincing himself that he is particularly 
sensitive to pain.⁴ These three modes were
combined by standardizing the subjects’ 
scores on each and giving every mode equal 
weight. The resulting standardized scores were 
summed across the three modes to provide one 
total measure of dissonance reduction for each 
subject. (The dependent variable of per- 
ceived taste was reflected since a low score 
on this variable implied dissonance reduc- 
tion.) 
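
The construction of this composite can be sketched in Python as follows. The sketch assumes z-score standardization using the population standard deviation (the text does not say which form was used), and all function names and example scores are hypothetical rather than taken from the study.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert raw scores to z scores (population SD; an assumption)."""
    m = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        raise ValueError("scores have zero variance and cannot be standardized")
    return [(s - m) / sd for s in scores]


def composite_reduction(taste, fear_change, pain_sensitivity_change):
    """Equal-weight sum of standardized scores on the three modes.

    Taste ratings are reflected (sign reversed) because a LOW rating of
    unpleasantness is what indicates dissonance reduction.
    """
    z_taste = [-z for z in standardize(taste)]
    z_fear = standardize(fear_change)
    z_pain = standardize(pain_sensitivity_change)
    return [t + f + p for t, f, p in zip(z_taste, z_fear, z_pain)]


# Hypothetical scores for three subjects (not data from the study).
totals = composite_reduction(
    taste=[4, 8, 12],
    fear_change=[2, 0, -2],
    pain_sensitivity_change=[1, 0, -1],
)
print([round(t, 2) for t in totals])
```

Because each mode is standardized before summing, no single mode dominates the total merely because its raw scores have a wider range.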

Table 4 contains the means of this composite
for each experimental condition. The
second-order interaction effect is significant
(F = 4.68, p < .05) and closely follows the
predicted pattern of a greater choice effect
under the low-probability conditions for those
subjects who prepared in vain.

TABLE 4

(CELL ENTRIES REPRESENT THE MEANS
OF THIS COMPOSITE)

                                  Condition
Choice in taking            Dissonant       Nondissonant
anesthetic                 .50     .95      .50     .95
High choice               +1.58    −.30     +.62   [illegible]
Low choice                −1.13    −.42     +.33   .18 [sign illegible]

Note. A high number indicates greater dissonance reduction.
The high values in the consonant cells are a function of
realistic increases in fear and anticipation of pain among
subjects assigned to the painful treatment.

Another way of looking at the data is in
terms of only the four dissonant cells where
the theory would predict a Choice × Probability
interaction in magnitude of dissonance
reduction. The results strongly support this
prediction with the interaction effect between
choice and probability yielding a significant
F ratio of 4.32 (p < .05, df = 1/24). That
is, choice has a significant effect in the .50
probability condition (t = 3.08, p < .01)
with those subjects having high choice manifesting
greater indications of dissonance reduction
whereas in the .95 probability condition,
the effect of choice is nil (see Table 4).

Hence, the analyses of the composite measure
of a subject’s usage of available modes of
dissonance reduction offer strong support for
the hypothesis.

⁴ The question which asked directly how painful
the subject thought the electrophysiological stimulation
would be was deleted as it completely failed to
discriminate among the experimental conditions. The
remaining two items pertaining to sensitivity to pain
were included because the trends, while insignificant,
were consistent with the others.

The dependent variable of evaluation of the experiment
was not included in the composite because
it represented a complete reversal of trend, and there
was some ambiguity about whether it really provided
a means of justifying the preparation.


CONCLUSIONS 


The present study examined the special
case of a negative outcome occurring after a
decision had been made which altered the
utility of the prior decision. It was predicted
that subjects would seek to justify their decisions
only when this outcome had been perceived
as a definite possibility at the time of
decision.
conditions in subjects’ ratings of fear of the
event for which they had prepared and the 
significant second-order interaction effect in 
minimizing the unpleasantness of the prepa- 
ration undertaken closely follow the predicted 
pattern. Furthermore, and perhaps most convincing,
a composite of these two measures
plus the subjects’ perceived sensitivity to 
pain showed the predicted relationship. 
Clearly, the data concerning subjects’ evalu- 
ation of the experiment offer no support for 
the hypothesis. 

In general, these findings are quite con- 
sistent with self-esteem theory (Deutsch, 
1961; Deutsch et al., 1962; Deutsch & Solo- 
mon, 1959). 

There was some indication that magnifying 
fear of the event prepared for was a “pre- 
ferred mode" of dissonance reduction in that 
a stronger relationship was obtained with this 
dependent variable than with distortion of 
the unpleasantness of the preparation. Such a 
preference would be expected from Festinger’s 
(1957) proposal that cognitions referring to 
beliefs and opinions are more easily changed 
than those referring to behavior or percep- 
tions of the physical environment. It would 
seem reasonable to assume that the “cen- 
trality” of the belief involved would be of 
utmost importance in determining its position 
in this hierarchy of susceptibility to change. 
One would suspect that fear of the treatment 
would not be a very “central” belief and, 
consequently, should provide an efficient way 
of reducing dissonance in the given situation. 

In conclusion, the present data would suggest
that if a person perceives that he has
made an irrational decision in a rational
world, he will feel foolish, or dissonant, and 
attempt to defend his decision if it comes to 
naught. However, if he feels that he made a 
rational decision in an irrational world, disso- 
nance reduction will not result. 
A REPUTATION TEST OF PERSONALITY INTEGRATION


CAREY B. DUNCAN ¹


Duke University Medical Center 


A reputation test designed to identify psychologically integrated individuals was 
administered to 31 social fraternities. Integrated individuals were compared with 
a contrast group along several dimensions. These results suggest that the psy- 
chologically integrated person is one: who has a positive self-concept, who 
perceives himself largely responsible for what happens to him, in whom the 
valuing process is internally generated, who has a wide range of interests and 
activities, and who is intellectually efficient. Hypotheses that he would perform 
in a more complex and creative manner were not supported in this investigation. 


Seeman (1959) pointed out the relative 
paucity of research and theory in the area 
of personality integration or optimal adjust- 
ment. The concept of organismic integration 
was suggested by him as a valuable theo- 
retical framework from which such research 
might proceed. Within this theory, an individual’s
total behavior is organized in terms
of a series of behavioral subsystems, such 
as the interpersonal subsystem, the cognitive 
subsystem, the physiological subsystem, etc. 
Personality integration is defined in terms of 
the quality of the interaction within these 
systems. One purpose of this study was to
examine the relationship between certain cognitive
and interpersonal variables.

The researcher who chooses to work in 
this field immediately becomes aware of the 
fact that a major problem centers around 
the selection of a research population. The 
second purpose of this investigation was, 
then, to design a selection technique which 
would be suitable for use with a college 
population, specifically constructed to permit 
the identification of psychologically inte- 
grated individuals, and easily replicated from 
one situation to another. 


¹ The author would like to express appreciation
to Julius Seeman for helpful suggestions in all 
phases of this study. 

Study I was conducted while the author was at 
George Peabody College and was submitted as a 
dissertation in partial fulfillment of the requirements 
for the Doctor of Philosophy degree. Study II was 
conducted while the author was at Kansas State 
University. 


Reputation Test 


The Reputation Test technique is an economical
one in almost all respects; it may be
characterized by its simplicity and directness.
It has been used frequently in work with 
grade-school children, most notably by Tryon 
(1939). More recently, Wiggins and Winder 
(1961) constructed the Peer Nomination 
Inventory to measure adjustment in pre- 
adolescent boys. They found reputation test 
measures of dependency and aggression to 
be predictive of overt interpersonal behavior 
in fifth- and sixth-grade boys (Winder & 
Wiggins, 1964). More germane to the issue 
of personality integration is a study by Lewis 
(1959) which explored construct validity of 
a reputation test with elementary-school children.
One of the constructs examined by him
was social acceptability. He found the technique
to be a useful one for identifying children
with certain positive behaviors. A number
of social psychology studies in small
groups and leadership have utilized the peer 
nomination technique. Summaries of such 
research may be found in Cartwright and 
Zander (1960), Hare, Borgatta, and Bales 
(1955), and Hare (1962).

The purpose of the present reputation 
technique was to identify the person seen as
high in personality integration. The Person- 
ality Integration Reputation Test (PIRT) 
consists of seven items. Six of the items were 
derived from Jahoda’s (1958) extensive 
review of the literature pertaining to concepts 
of positive mental health. She grouped these
concepts under six general headings. An item
was constructed to reflect the tenor of each
of these categories. The seventh item was 
derived from a subheading of Jahoda’s which 
seemed behaviorally independent enough to 
warrant its inclusion as a separate item.
The seven items are as follows: 


1. Who are the persons who seem best able to 
express their feelings without hurting the feelings 
of others? 

2. In your opinion who are the three persons in 
this group who seem to understand themselves 
best; that is, are aware of their shortcomings and 
strengths? 

3. Who are the ones who seem best able to 
keep an open mind and not jump to premature 
conclusions?

4. Who are the three persons who seem the most 
able to deal effectively with everyday tensions and 
anxieties?

5. Who are the ones who are most likely to stick 
by their own values and beliefs, even when these 
may be somewhat unpopular? 

6. Which three persons seem capable of forming 
deeper and more profound relationships with others 
and seem to be genuinely concerned with other 
people? 

7. Which persons seem to you to have been the 
most successful in all phases of their life: social, 
personal, educational, etc.? 


When the test was administered to a group 
with instructions to nominate three separate 
persons for each item, the distribution ob- 
tained was a highly negatively skewed one, 
permitting the selection of the few individuals 
who received a large number of nominations. 
These persons are identified as integrated 
personalities. 

Reliability estimates for the test have been 
obtained by both the split-half and test- 
retest methods. Split-half reliability was esti- 
mated with the Spearman rank-order coef- 
ficient (Siegel, 1956) for three groups to 
which the test was administered. The obtained
coefficients were .82, .78, and .85.
Test-retest rankings were similarly com- 
pared; the obtained coefficient was .88. These 
correlations compare favorably with those 
reported by Gronlund (1959). In the con- 
sideration of reliability it should be noted 
that the focus of concern in this investiga- 
tion was upon high scoring individuals. The 
correlations reported above reflect the stability
of the total group. In no instance would
there have been a change in individuals 
selected for the psychologically integrated
group. 
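
For readers who want to see how a rank-order coefficient of this kind is obtained, a minimal Python sketch follows. It assumes each group member has a nomination count on each half of the test (how the halves were actually formed is not stated here), ranks tied counts by their average rank, and computes the Pearson correlation of the two rank vectors. The function names and the counts are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 scores")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("ranks have zero variance")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical nomination counts for five members on two halves of the test.
half_a = [12, 7, 3, 0, 1]
half_b = [10, 8, 2, 1, 0]
print(round(spearman_rho(half_a, half_b), 2))  # prints 0.9
```

The same function applies to test-retest data by passing the first and second administrations as the two sequences.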

The nature of these data renders the usual
intercorrelational techniques somewhat mean- 
ingless. Kendall coefficients of concordance 
(Siegel, 1956) were computed for several 
groups to which the PIRT was administered, 
and the values of these were consistently
about .50. However, these coefficients reflect 
the considerable lack of consistency evident 
in the scores of low PIRT status individuals. 
For example, in one instance the difference 
between the individual's name being men- 
tioned once on an item as opposed to not 
at all resulted in a rank difference of 18.5 
and 38. Such large differences were not at 
all uncommon.

Some idea of the consistency of the seven 
PIRT items with reference to the population 
of integrated personalities may be gained by 
examining the percentage of the total score 
represented by each question. The averages
of such percentages for the 59 integrated 
subjects used in this study follow, and the 
range of percentages for each item is indi- 
cated parenthetically: Item 1, 19% (8- 
30%); Item 2, 13% (2-23%); Item 3, 15% 
(6-28%); Item 4, 14% (4-24%); Item 5, 
6% (0-26%); Item 6, 14% (2-26%); and 
Item 7, 19% (2-35%). 
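
The percentage figures above express each item's share of a subject's total nominations. A minimal Python sketch of that calculation is given below; the function name and the nomination counts are hypothetical and are not data from the study.

```python
def item_percentages(nominations_per_item):
    """Percentage of a subject's total score contributed by each item."""
    total = sum(nominations_per_item)
    if total == 0:
        raise ValueError("subject received no nominations")
    return [100.0 * n / total for n in nominations_per_item]


# Hypothetical subject: nominations received on Items 1 through 7.
counts = [10, 6, 8, 7, 2, 7, 10]
print(item_percentages(counts))
# prints [20.0, 12.0, 16.0, 14.0, 4.0, 14.0, 20.0]
```

Averaging these percentages item by item across subjects gives figures of the kind listed above, and the smallest and largest value per item give the parenthetical ranges.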

An examination of the distributions ob- 
tained with the reputation test revealed that 
integrated individuals were rather consist- 
ently not nominated on Question 5 (“Who 
are the ones who are most likely to stick by 
their own values and beliefs, even when these 
may be somewhat unpopular?”). It was
also apparent that those individuals who did 
receive nominations on this question were 
rarely nominated with frequency on the other 
items. This item represents a rewording of a 
similar item used in pilot work where the 
same inconsistency occurred. Because of the 
lack of correlation of this item with the 
total score, it was deleted; and the other six 
items were summed to yield the total score. 

Further, with regard to the percentages 
reported above, it should be noted that no 
single item is so powerful that all integrated 
subjects received a majority of the nominations
on that item. Thus, the use of no single
question would result in the selection of the
same population.


Hypotheses 


Many theorists, most notably Rogers 
(1961), have characterized the integrated 
person as one who has a positive self-concept. 
There is a body of research literature (e.g., 
Lipkin, 1948; Raimy, 1948; Seeman, 1949) 
which suggests that as therapy progresses the 
number of positive statements made about 
the self increases. The first hypothesis of this 
study was, therefore, that the integrated 
group would demonstrate a more positive 
self-concept than a contrast group. 

Recent research in the area of locus of 
control has suggested that persons with an 
internal locus of control are more effective 
in the exertion of social influence (Phares, 
1965); that they have more knowledge about 
important environmental factors which have 
personal implications (Seeman & Evans, 
1962); that they are less likely to go along 
with a group judgment which they know to 
be incorrect (Crowne & Liverant, 1963); etc. 
It was hypothesized, then, that the psycho- 
logically integrated group would demonstrate 
an internal locus of control to a greater extent 
than the contrast group. 

Theorists such as Rogers (1961) and
Riesman, Glazer, and Denney (1953) have
considered the importance of an internalized 
value system in personality integration. It 
was hypothesized here that the integrated 
group would demonstrate an internal locus 
of evaluation to a greater extent than the 
contrast group. 

Seeman (1963) found that the child who 
has a keen awareness of and interest in his 
environment is rated high in adjustment by 
teachers. Many personality theorists (e.g.,
Lewin, 1935) have dealt with the importance 
of environmental contact in adjustment. It 
was hypothesized, therefore, that the inte- 
grated group would be characterized by a 
greater degree of environmental contact than 
the contrast group. 

The fifth hypothesis, that the integrated 
group would demonstrate a greater capacity 
for novel behavior than the contrast group, 
was suggested by the theoretical writing of
Andrews (1961), Szurek (1959), and others
who suggest that creativity and personality
integration are highly correlated. Such a rela- 
tionship has been found in studies reported 
by Barron (1963). 

Rokeach (1960) discussed the interaction 
between a person’s cognitive functioning and 
his emotional functioning. His research has 
suggested these to be but two different facets 
of a person’s total behavior. Additionally, 
there has been a considerable amount of 
research in recent years relating cognitive 
functioning to various personality variables 
(e.g., Witkin, Lewis, Hertzman, Machover, 
Meissner, & Wapner, 1954). Some of the 
research on therapy process reported by 
Rogers (1961) suggests a correlation between 
cognitive complexity and personality integra- 
tion. It was, therefore, hypothesized that the 
integrated group would demonstrate cognitive 
complexity to a greater extent than the 
contrast group. 

Campbell and Fiske (1959) discussed the 
notion of discriminant validity, emphasizing
that a test should not correlate too highly 
with measures from which it should differ. 
It was deemed necessary here to rule out 
intelligence with regard to status on the 
PIRT. It was, therefore, hypothesized that 
the integrated and contrast groups would not 
differ with regard to intelligence. 

Intellectual efficiency has been posited by 
many as a concomitant of personality inte- 
gration; and, indeed, Barron (1954) has 
found such a relationship. The final hypothe- 
sis of this investigation was, then, that the 
integrated group would demonstrate a greater 
degree of intellectual efficiency than the 
contrast group. 


METHOD 
Population 


The decision was made to restrict the population 
to males on the basis that personality integration 
probably has somewhat different meanings for each 
sex. Social fraternities provide large numbers of 
men who have had contact with one another over 
a period of time in a variety of situations. These 
seemed, therefore, to be appropriate groups to which 
the PIRT might be administered.

The study was first conducted with a population
which consisted of the active members of 12 social
fraternities at a private southern university. It was
then replicated with a population of 19 fraternities
at a large midwestern state university. The results
of both investigations will be reported here, data 
from the first study being identified as Study I, 
that from the second, as Study II. A total of 454 
persons in Study I and 663 persons in Study II 
were administered the PIRT with the following 
instructions: 


On the following pages are some questions 
which deal with certain kinds of actions. For each
question you will be asked to nominate the three 
persons who are members of your fraternity who 
seem to exemplify this behavior more than others. 
In making your decisions about whom you will 
nominate for each question, try to think back 
and recall actual instances when the person dis- 
played the described behavior. These questions 
are not trying to discover the most popular 
members of your fraternity; so try to eliminate 
that concept in making your decisions. 

You should nominate three separate persons 
for each question. However, you may use the 
same name as many times as you like on different 
questions. The order in which you list the names 
is unimportant.


The results of these nominations were tallied, 
and those persons who were nominated consistently 
by their peers and whose total scores were signifi- 
cantly at the top of the distribution were selected 
as the integrated group. The Poisson distribution 
(Feller, 1950) permits a theoretical statement of 
the probability associated with the receipt of a 
given number of nominations for any one indi- 
vidual. The probabilities associated with the nomina- 
tions for each member of the integrated group 
were all beyond the .0001 level. 
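
The Poisson argument can be sketched as follows. The sketch assumes that,
if nominations were spread at random, the number received by any one
member is approximately Poisson with mean equal to the average number of
nominations per member. The function name and the figures used in the
usage note (a mean of 18 and a total of 40) are hypothetical and are not
taken from the study.

```python
import math


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), summing the tail directly."""
    if k <= 0:
        return 1.0
    # P(X = k), computed in log space so that large k does not overflow.
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    i = k
    # Add successive terms until they no longer change the sum.
    while term > 0.0 and term > total * 1e-17:
        total += term
        i += 1
        term *= lam / i  # P(X = i) obtained from P(X = i - 1)
    return total
```

Under these assumed figures, `poisson_upper_tail(40, 18.0)` falls well
below .0001, which is the kind of comparison the selection criterion
describes.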

In the selection of a contrast group the factor 
of class standing (senior, junior, or sophomore) 
was considered, and the groups were matched on 
this variable. This is the only variable which was 
considered in the selection of this group. Reputation 
test status was ignored so that contrast group mem- 
bers could, and did, have relatively high, low, or 
average reputation test scores. Thus, from within 
the same fraternity population, matched for class 
standing, a contrast group of equal number was 
selected. In Study I these procedures yielded two 
groups of 15 seniors, 7 juniors, and 3 sophomores
each. In Study II two groups of 25 seniors, 9
juniors, and 4 sophomores each were identified.

The participation of subjects in the remainder 
of the investigation was on a voluntary basis. In 
Study I 25 integrated individuals were identified 
by the PIRT; 22 of these agreed to participate. 
Twenty-six contrast subjects were contacted in order 
to provide a group of equal number. In Study II 
38 integrated individuals were identified, and 37 
of these participated. Forty-four contrast subjects 
were contacted to provide a group of equal number.
Thus, in Study I these procedures permitted the
selection of two groups of 22 individuals each; in
Study II there were 37 individuals in each group.


Procedure


Subjects who participated in the experiment were 
requested to take five tests under group conditions. 
No information was given them concerning their 
status on the PIRT; they were, however, aware 
that there was a connection between the two events. 
A brief description of each of the five tests follows: 

Self-Rating Scale. This is an eight-item scale 
constructed by the author. The first seven items 
were the same items found on the PIRT, but 
changed to declarative statements in the first-person 
singular. Each subject rated himself from 1 to 8 on
each of these items as to the extent to which the
item was characteristic of himself. The theoretical
range of scores on this part of the scale was from
7 to 56. The final item consisted of the statement,
“In general, how would you characterize your own
over-all adjustment?” Again, each subject rated
himself from 1 to 8 in response to this item, and
the score was referred to as the Adjustment score.
A test-retest reliability coefficient of .91 was obtained
with this instrument. In Study II the
Tennessee Self Concept Scale (Fitts, 1964) was
also administered.

Social Reaction Inventory. Liverant and Scodel 
(1960) developed the Social Reaction Inventory 
to provide a measure of locus of control. Cromwell
(1963) has defined locus of control as a psycho- 
logical construct describing the degree of predisposi- 
tion within an individual toward the belief that 
events in life are haphazard, the result of chance, 
or controlled by other people. The construct is 
viewed as a continuum, the polar points of which 
may be labeled internal locus of control and 
external locus of control. Accordingly, the person 
who feels that he can exert control over what 
happens to himself would be described as having 
an internal locus of control. 

In Study I the Inventory consisted of 60 items 
arranged in a forced-choice scale. Subjects were 
instructed to choose one of two alternatives for each 
item which more strongly corresponded to their 
beliefs. The test was scored in the internal direction, 
a high score indicating an internal locus of control. 
Twenty additional “buffer” items were added to 
the Inventory in an effort to disguise the purpose 
of the test. Test-retest reliability of .93 with 40 
male college subjects was reported by Liverant. 
In Study II a 23-item, factor-analyzed version of 
the same Inventory was administered. 

ALOE Scale. The Adult Locus of Evaluation 
(ALOE) Scale is a 24-item scale devised by Miller.
He described locus of evaluation as a continuous 
variable, the extreme points of which are labeled 
internal locus of evaluation and external locus of 
evaluation. Thus, the person for whom the valuing 
process lies within himself would be described as
having an internal locus of evaluation. On this Scale 
subjects read each of the 24 items and then indicated 
by marking “Yes” or “No” whether they felt the 
statement to be applicable to themselves. This Scale
was also scored in the internal direction. The author
obtained a test-retest reliability coefficient of .84 
with this instrument. 

Activity Questionnaire. On this questionnaire sub- 
jects were asked to list extracurricular activities, 
honors and offices held, hobbies, interest areas, and 
leisure time activities. The Questionnaire was con- 
structed by the author to provide information con- 
cerning the environmental contact hypothesis. A 
“score” was obtained with the instrument by count- 
ing the number of independent items listed by each 
subject. A test-retest reliability coefficient of .82 
was obtained. 

Circles Test. On this test, described by Torrance
(1960), subjects were given sheets on which there
were 15 blank circles. The instructions were to
“...see how many objects you can make from
these circles, using the circle as a basis for the
design.” In Study I a 10-minute time limit was
imposed; in Study II the time limit was 4 minutes. 
Subjects were allowed to use as many sheets of 
circles as they wished. 

Three scores were obtained with this instrument. 
A productivity score was derived by counting the 
number of circles used by each subject. The com- 
plexity score was based upon the extent of elabora- 
tion of the design and scored in accordance with 
a 5-point scale devised by Collins.

Comparisons of integrated and contrast
group means on the several dependent vari- 
ables are presented in Table 1. In each case 
the significance of the difference between the 
two group means was tested by the t statistic
for two independent samples (Walker & Lev,
1953). Tests for homogeneity of variance
were accomplished by means of the Hartley
Fmax statistic (Winer, 1962). This assumption
was violated in only one case, the originality
score (9% or less) in Study I. Boneau
(1960) has indicated, however, that the ef- 
fects of violation of this assumption upon the 
t statistic are minimal where, as in this case, 
sample sizes are equal. 
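
For readers who wish to check the entries in Table 1, the following sketch
computes a pooled-variance t for two independent samples, and the Hartley
Fmax ratio for two groups, from summary statistics alone. The function
names are hypothetical, and the sketch assumes that the tabled sigmas are
sample standard deviations; the exact computational conventions used in
the study are not stated, so small discrepancies from the tabled values
are to be expected.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t with pooled variance; returns (t, degrees of freedom)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


def hartley_fmax(sd1, sd2):
    """Ratio of the larger to the smaller of two group variances."""
    v1, v2 = sd1 ** 2, sd2 ** 2
    return max(v1, v2) / min(v1, v2)


# Activity Questionnaire, Study II, as read from Table 1 (37 per group).
t_value, df = pooled_t(27, 8.0, 37, 19, 5.7, 37)
```

With these inputs the computed t is close to the tabled value of 4.9 on
72 degrees of freedom.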

The hypothesis that the integrated group 
would demonstrate a more positive self- 
concept than the contrast group was sub- 
jected to two tests. The first of these utilized 
the score derived from the first seven items 
on the Self-Rating Scale; the second test 
was made from scores on Item 8 of the same 
instrument. The results of both tests were 
seen as providing support for the hypothesis. 
It is of further interest to note that if the
Self-Rating Scale data are analyzed without
including Question 5 (which corresponds to
Question 5 on the PIRT) the same significant
relationship obtains. Further, if the two
groups are compared on Question 5 alone, the
integrated group mean is slightly higher, but
not significantly so.


TABLE 1

COMPARISON OF INTEGRATED (I) AND CONTRAST (C)
GROUPS ON DEPENDENT VARIABLES

                                        M                      Sigma
                                  I          C            I            C            t

Self-Rating Scale
  Study I                        46         41           3.6          5.7          3.4*
  Study II                       46         42           4.4          4.5          4.3*
Adjustment Rating
  Study I                        7.0        6.0           .7          1.2          3.4*
  Study II                       7.4        6.5           .7          1.1          4.8*
Social Reaction Inventory
  Study I                        45         40           7.5          6.0          2.4*
  Study II                       18         15           2.9          2.8          3.3*
ALOE Scale
  Study I                        19         16           3.5          3.2          3.3*
  Study II                       19         17           2.3          2.8          3.8*
Activity Questionnaire
  Study I                        22         16           4.8          4.6          4.6*
  Study II                       27         19           8.0          5.7          4.9*
Originality a (9% or less)
  Study I                   [illegible] [illegible]  [illegible]  [illegible]  [illegible]
  Study II                       1.2        1.5           .7           .5          1.2
Originality a (4% or less)
  Study I                        1.5        1.5           .3      [illegible]  [illegible]
  Study II                       1.4        1.2           .6           .6          1.2
Productivity (Number of circles)
  Study I                        23         24           8.0         11.2           .4
  Study II                       11         10           4.5          3.3          1.1
Productivity (Number of designs)
  Study I                        19         17           7.3          5.0          1.1
  Study II                        9          8           4.0          3.3          1.4
Complexity score
  Study I                        2.4        2.6           .3           .4          1.6
  Study II                       2.5        2.4           .6      [illegible]  [illegible]
GPA
  Study I                        1.9        1.3           .4           .5      [illegible]
  Study II                       2.7        2.2           .6           .6          4.1

a Scores are arc-sine transformations of percentages.
* p < .05.

The Tennessee Self Concept Scale (Fitts,
1964) was administered to the subjects in
Study II. These data, which are not summarized
in Table 1, would suggest that both
the integrated and contrast groups appear to
be above average in terms of personality
integration when compared with Fitts’ norm
group. This is based upon the fact that the
differences on almost all of the 29 variables
measured on this Scale are in the same
direction in which Fitts’ personality integration
group differs from his norm group. In
comparing the integrated and contrast groups
the means did not differ significantly for any
variable. However, the direction of the differences
was predictable for 22 of the 29
variables, a prediction significant at less than
the .01 level.
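
The test applied to the 22-of-29 result is not named. One plausible
reading is a sign test, in which each variable is equally likely to fall
in either direction under the null hypothesis. The sketch below computes
the upper-tail binomial probability on that assumption; the function name
is hypothetical.

```python
from math import comb


def sign_test_upper(successes, trials):
    """P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    favorable = sum(comb(trials, i) for i in range(successes, trials + 1))
    return favorable / 2 ** trials


p_one_tailed = sign_test_upper(22, 29)
p_two_tailed = 2 * p_one_tailed
```

On this reading both the one-tailed and the two-tailed probabilities fall
below .01, in agreement with the level reported.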

Hypotheses that the integrated group 
would demonstrate an internal locus of con- 
trol and an internal locus of evaluation to a 
greater extent than the contrast group were 
also accorded support by the data, as was 
the hypothesis that the integrated group 
would be characterized by a higher degree 
of environmental contact. 

It was hypothesized that the integrated 
group would demonstrate a greater capacity 
for novel behavior than the contrast group, 
and this was tested by the originality score 
derived from the Circles Test. This score was 
expressed as a percentage of original re- 
sponses, the criterion being a design which 
appeared less than 10% of the time. The 
results were nonsignificant. The data were 
then analyzed using a more stringent criterion
of originality; those responses which
appeared less than 5% of the time were
considered originals. Once more, the results
were nonsignificant. In both cases the percentages
were converted by means of the
arc-sine transformation (Winer, 1962) before
computations were carried out.
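
A minimal sketch of the transformation follows. It assumes the common
form, phi = 2 arcsin(sqrt(p)) with the result in radians; the exact
variant used in the study is not stated, and the function name is
hypothetical.

```python
import math


def arcsine_transform(proportion):
    """Variance-stabilizing transform for a proportion between 0 and 1."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return 2.0 * math.asin(math.sqrt(proportion))
```

The transformed values run from 0 to pi, and the transformation makes the
variance of a proportion roughly independent of its size, which is why it
is applied before the group means are compared.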

Two tests were made of the hypothesis 
that the integrated group would demonstrate 
cognitive complexity to a greater extent than 
the contrast group. The first of these utilized 
the productivity score derived from the 
Circles Test, and the results were nonsignificant.
On a post hoc basis it seemed that a
more reasonable score to use as a measure
of productivity was the number of designs
produced by each subject. However, a test
of the significance of the difference of the
two means based upon this criterion was also
nonsignificant.

The second test of the “cognitive com- 
plexity hypothesis” was based on the com- 
plexity score derived from the Circles Test. 
Three judges rated each design according to
Collins’ criteria. Interjudge reliability was
estimated by Kendall’s coefficient of concordance
(Siegel, 1956), and the obtained
coefficient was .90 in Study I, .86 in Study II.
Judges’ ratings were summed and averaged
and a mean complexity score was then obtained
for each subject. This test of cognitive
complexity was also nonsignificant, and the 
hypothesis that the two groups differed on 
this variable was rejected. 

To test the hypothesis that integrated sub- 
jects would demonstrate greater intellectual 
efficiency than the contrast subjects, cumu- 
lative grade-point averages (GPAs) were ob- 
tained for all subjects. The data in Table 1 
would support the notion that the academic 
achievement of integrated subjects was 
superior to that of the contrast subjects. As 
control measures, in Study I the verbal and 
mathematical scales of the College Entrance 
Examination Boards (CEEB), taken by sub- 
jects at the time of college entrance, were 
used as measures of scholastic aptitude. This
variable is highly related to intelligence.
These scores were available for 20 of the
integrated subjects and for 21 of the contrast 
group. In Study II quantitative, language, 
and total scores for the American Council on 
Education Test of Intelligence (ACE), given 
at the time of admission, were available 
for 23 of the integrated group and 18 of the 
contrast group. These control data are sum- 
marized in Table 2 and they justify ac- 
ceptance of the null hypothesis that the 
two groups do not differ with regard to 
“intelligence.” 

TABLE 2

COMPARISON OF INTEGRATED (I) AND CONTRAST (C)
GROUPS ON CONTROL MEASURES

                                   M              Sigma
                               I       C        I       C        t

CEEB
  Verbal, Study I             529     502       78      91      1.0
  Quantitative, Study I       588     560       78      83      1.1
ACE
  Verbal, Study II             69      62       13      12      2.0
  Quantitative, Study II       51      47       10      11      1.1
  Total, Study II             120     109       21      22      1.7


To specify further the relationship between
the intellectual and achievement variables
and PIRT status, biserial correlations were
computed for the data in Study I. The correlation
between PIRT status and the verbal
score of the CEEB was .20; between PIRT
status and the mathematical score of the
CEEB a correlation of .22 was obtained;
and the correlation between GPA and PIRT
status was .77. These results indicate that
intellectual capacity has a negligible relationship
to PIRT status, but that the individuals
high in personality integration are
efficient in the use of their intelligence.
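
A biserial correlation of this kind can be sketched as below, using the
usual textbook formula r_b = [(M1 - M0) / s] (pq / y), where s is the
standard deviation of all scores, p and q are the proportions in the two
groups, and y is the ordinate of the unit normal curve at the point
dividing p from q. The function name and the data in the usage note are
hypothetical, and the formula assumes that the dichotomy reflects an
underlying normally distributed trait.

```python
from statistics import NormalDist, mean, pstdev


def biserial_r(scores, in_upper_group):
    """Biserial correlation between a score and a two-way grouping."""
    upper = [s for s, g in zip(scores, in_upper_group) if g]
    lower = [s for s, g in zip(scores, in_upper_group) if not g]
    p = len(upper) / len(scores)
    q = 1.0 - p
    unit_normal = NormalDist()
    # Height of the normal curve at the cut that splits the area p : q.
    ordinate = unit_normal.pdf(unit_normal.inv_cdf(p))
    spread = pstdev(scores)
    return (mean(upper) - mean(lower)) / spread * (p * q / ordinate)
```

For example, `biserial_r([1, 2, 3, 4, 5, 6, 7, 8], [0, 0, 1, 0, 1, 0, 1, 1])`
treats the flagged scores as the integrated group. Unlike the
point-biserial coefficient, the biserial coefficient can exceed 1.0 when
the groups are sharply separated, so large values call for some caution.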


DISCUSSION


Evidence for the validity of the PIRT was 
provided by the support accorded six of the 
eight hypotheses in this investigation. It may, 
then, be concluded that the PIRT is a useful 
research tool for the selection of integrated 
individuals. In this regard, it should be em- 
phasized that the contrast group was not 
made up exclusively of individuals who 
scored low on the PIRT. In fact, the entire 
population in this study is a fairly select one. 
All of the subjects were achieving success- 
fully in academically respected universities; 
all but five were upperclassmen. These factors
make the achievement of significant differences
between the two groups an even
more striking fact. From the standpoint of 
reliability, the PIRT is both internally con- 
sistent (split-half correlations) and stable 
over time (test-retest correlations). 

The variables studied here cut across 
Seeman’s (1959) cognitive and interpersonal 
subsystems; so that, concomitantly, there is 
support for his theory of organismic integra- 
tion. His notion that the integrated person 
differs in measurable ways from his “normal” 
or "average" peers is clearly reinforced. This 
does appear to be a fruitful and utilitarian 
framework from which research in personality 
integration might proceed. 

It is important to note that these variables 
do not operate in a mutually exclusive 
fashion. This is perhaps best illustrated by 
a consideration of the environmental con- 
tact variable. Within the integrated group 
there are many individuals who are highly 
visible on campus (e.g., fraternity presidents, 
varsity athletes, student body officers, etc.). 
However, there are as many, if not more, of
these campus “stars” who did not fall into
this group. Thus, while the integrated person
is likely to be active and relatively prominent
in campus affairs, the converse of this
statement is not true. The same thing obtains
for each variable considered in this study.
There would, then, be obvious pitfalls in
attempts to conduct research from the other
side of the fence. One cannot, for example,
conclude that the person who makes highly
efficient use of his intellectual capabilities
is an integrated person. This factor was
strikingly illustrated when the data were
combined across the two universities and integrated
and contrast groups and total means
computed. Only one integrated subject scored
above the mean on each variable, and no
contrast subjects made scores consistently
below the mean.


There is an alternative explanation which
needed to be considered with regard to the
“intellectual efficiency hypothesis.” It could
be postulated that the same techniques which
contribute to the integrated subjects’
interpersonal success made them, also,
effective “grade-getters.” However,
there is evidence to refute this. In
Study I, in an effort to obtain some faculty
ratings of the subjects, each subject was
asked to list the three faculty or staff members
with whom he had had the greatest
amount of contact. Rating scales were distributed
to these persons. If they were returned
at all, it was usually with a note that these
people were not personally known. Thus,
interpersonal effectiveness can be ruled out as
an explanation of the “intellectual efficiency
hypothesis.”

Three separate scores were obtained with
the Circles Test; in Study I none of them
yielded a significant difference. It was felt
that the 10-minute time limit was too long;
most subjects had apparently exhausted their
repertoire of responses to the test and the
results were nondiscriminating. Though the
time limit was shortened to 4 minutes in
Study II, significant results were still not
obtained. It is felt that this test is simply
not stringent enough to assess creativity
adequately within this population.

There are many avenues of further investigation
with this instrument which need to be
followed. For example, there is a need to
study these individuals in carefully controlled
behavioral situations. Some pilot work now
being conducted by the author has suggested
that integrated persons demonstrate certain
characteristic behaviors in small group situations
and that small groups may be a profitable
vehicle for such research. Also, it would
be of interest to know what similarities and
differences might be found when a similar
female population is used. This project is
also now underway.

Up to this point the author's concern has 
been only with high scoring individuals. Re- 
search designed to shed light on the correlates 
of reputation test status of other than high 
scoring individuals would be of interest. One 
line of investigation of considerable potential
concerns those individuals who score relatively
high on Question 5 and relatively low
on other questions. An examination of the 
distributions of PIRT scores indicates that 
such individuals can readily be found. The 
intent of Question 5 was to identify those 
persons who are free to be nonconforming 
when they feel it is appropriate, but who 
are not compelled to be nonconforming. 
However, it would appear that this type of 
person was not one who was nominated. In 
this regard it is interesting to note that integrated
persons tended to rate themselves
highly on this question on the Self-Rating Scale
(though not significantly higher than the contrast
group did). Thus, while the integrated person's
peers tended not to nominate him on Ques- 
tion 5, he is likely to rate himself highly 
on this question. 

The results of this study contribute to the 
growing network of data which describes the 
integrated personality (e.g., Barron, 1963; 
Seeman, 1963). This investigation suggests 
that his peers tend to make very positive
statements about him and that he is likely
to make the same positive statements about
himself. Probabilities of success or failure in
what he attempts are likely to be evaluated
in terms of his own capabilities, rather than
in terms of chance factors such as "fate" or
"preordainment." Further, he believes himself
to be the best judge of his own behavior.
He is less likely than his peers to need the
stamp of approval on his actions. He is a
person of high environmental contact; a person
with broad interests; and he is likely to
involve himself in more activities than are
his peers. While he does not differ from his
peers in terms of scholastic aptitude, he is
likely to make more effective use of his
intellectual abilities.


88 male university students were either angered or treated in a neutral fashion
by E’s accomplice who earlier had been introduced either as “Kirk” or “Bob.”
Following a 2 × 2 × 2 factorial design, Ss then saw either a 7-min. prize-fight
scene, in which the actor, Kirk Douglas, received a beating, or an equally long
exciting movie about a track race. Finally, all Ss were given a socially sanctioned
opportunity to administer electric shocks to the accomplice. The greatest
number of shocks were sent by the angered men who had witnessed the prize
fight and who had been informed that the accomplice’s name was Kirk. The
latter’s name-mediated association with the witnessed aggression had apparently
heightened his cue value for aggressive responses causing him to evoke
the strongest volume of aggression from the men who were ready to act
aggressively.


According to a number of experiments, the 
display of aggression on the movie or tele- 
vision screen is more likely to increase than 
reduce the probability of aggressive behavior 
by members of the audience (Bandura & Wal- 
ters, 1963; Berkowitz, 1962, 1964, 1965; 
Berkowitz & Rawlings, 1963; Walters, 
Thomas, & Acker, 1962). This heightened 
likelihood of aggression is not always ap- 
parent, however. If the audience regards the 
depicted aggression as being unwarranted or 
morally wrong, inhibitions will be aroused.
Such restraints against hostility can weaken 
the intensity of the aggressive actions shown 
by the audience members, or may even cause 
them to avoid displaying any overt hostility 
at all (Berkowitz & Rawlings, 1963). 

There is some question as to what the 
specific role of the witnessed hostility is. Ban- 
dura and Walters (1963) generally prefer 
to emphasize two processes in accounting for 
film-engendered aggression: imitative learning 
and inhibitory and disinhibitory effects (cf.
p. 60). By watching the actions of another
person, they state, “the observer may acquire
new responses that did not previously exist in
his repertory.” In addition, the observed
model’s behavior may also either arouse or
weaken the audience's inhibitions against
particular actions. Thus, according to this
analysis, witnessed hostility presumably gives
rise to a persistent action tendency, a readiness
to display aggression toward anyone. If
certain persons are attacked rather than other
people, the former supposedly have produced
a disinhibition against aggression, for example,
by somehow reminding the observer
that hostility toward these people is permissible.
(As the reader will recognize, the Bandura-Walters
analysis is reminiscent of the
classic scapegoat theory of prejudice. This
latter doctrine also contends that the frustrated
or prejudiced person is ready to attack
just anyone and aggresses against those groups
who are visible and safe to attack.)

1 This study was carried out by RGG under LB's
supervision as part of a project sponsored by
Grant G-23988 from the National Science Foundation
to the senior author.

But while modeling and inhibition effects 
are undoubtedly important, filmed violence 
may also serve to elicit aggressive responses 
from the observer (Berkowitz, 1962, 1964). 
The depicted aggression may increase the 
probability of attacks upon particular targets, 
depending upon the aggression-evoking cue 
properties of these objects. Observed aggres- 
sion presumably is likely to have aggressive 
consequences as a function of: the strength of 
the observer's previously acquired aggressive- 
ness habits; the association between the wit- 
nessed event and both the situations in which 
the observer had learned to act aggressively, 
and the postobservation setting; and the in- 
tensity of the guilt and/or aggression-anxiety 
also aroused by the observed violence (Berkowitz,
1962, p. 238). Putting it simply, this
reasoning implies that the aggressiveness
habits activated by witnessed hostility are
often only in “low gear,” so to speak. Other
appropriate, aggression-evoking cues must be 
present before the observed violence can lead 
to strong aggressive responses by the ob- 
server. These cues are stimuli in the post- 
observation situation which have some as- 
sociation with the depicted event, or which 
may be connected with previous aggression- 
instigating situations. Thus, a person who sees 
a brutal fight may not himself display any 
detectable aggression immediately afterwards, 
even if his inhibitions are relatively weak, 
unless he encounters stimuli having some 
association with the fight. (Returning to the 
problem of scapegoating, this analysis main- 
tains that the victimized groups evoke hostile 
responses from people who are ready to act 
aggressively; these groups have appropriate 
cue properties as well as being safe and visible 
targets—cf. Berkowitz, 1962; Berkowitz & 
Green, 1962).

Although there is considerable evidence that 
is consistent with this formulation (cf. Berko- 
witz, 1964), attempts to apply it to the conse- 
quences of movie aggression have led to some- 
what ambiguous results (Berkowitz, 1965). 
Male college students were either angered or 
treated in a neutral fashion by a person who 
had been labeled either as a college boxer or 
a speech major. After this, the subjects wit- 
nessed either a prize fight or neutral film 
scene. It was found that the anger instigator 
received the greatest volume of aggression 
when the subjects had seen the prize fight 
and the anger instigator was said to be a 
college boxer; his label-induced association 
with the aggressive scene could have caused 
him to elicit aggressive responses from the 
men who were ready to act aggressively. How- 
ever, there was also an indication that the 
label “boxer” could have strengthened the 
person’s cue value for aggression regardless of 
the nature of the film witnessed by the sub- 
jects. This latter finding confirms the im- 
portance of the available target’s cue value 
for aggression. But in the context of this 
study it raises a question as to whether the 
target’s association with the observed vio- 
lence had contributed to his aggression-elicit-
ing properties. The present experiment is an-
other test of the eliciting-cue hypothesis. This
time, however, the association with the ag- 
gressive scene is varied by means of the avail- 
able target's name rather than his supposed 
role. 


METHOD 
Subjects 


The subjects were 88 male undergraduates at the 
University of Wisconsin. Seventy-two of these peo- 
ple had volunteered from sections of the introduc- 
tory psychology course in order to earn points 
counting toward their final grade. The remaining 16 
subjects were recruited from an introductory sociol- 
ogy course several weeks later without offering any 
grade-increasing inducements and were distributed 
evenly among the eight treatment groups. 


Procedure 


Three independent variables were arranged in a 
2 × 2 × 2 factorial design so that some subjects
would be (a) angered, (b) by a person having a 
name-mediated association, (c) with an aggressive 
scene. When each subject arrived at the laboratory 
he was met by a peer (actually the experimenter's
accomplice) and the experimenter. The first experi-
mental treatment was carried out by asking the two 
men what their names were. For half of the cases 
the accomplice identified himself as Kirk Anderson 
while for the remaining men he said his name was 
Bob Anderson. 1 
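
The factorial layout just described can be sketched by enumerating its cells. This is only an illustration: the factor and level labels are our own shorthand for the conditions named in the text, and the per-cell count assumes the 88 subjects were split equally, which the report does not state outright.

```python
from itertools import product

# Factor levels as described in the Method section (labels are illustrative).
FACTORS = {
    "arousal": ("angered", "nonangered"),
    "accomplice_name": ("Kirk", "Bob"),
    "film": ("aggressive", "track"),
}


def design_cells(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


cells = design_cells(FACTORS)

# Assumption: equal cell sizes across the eight treatment groups.
n_subjects = 88
per_cell = n_subjects // len(cells)

for cell in cells:
    print(cell, per_cell)
```

Under the equal-split assumption each of the eight groups would contain the same number of men.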

Following this, the experimenter said the experi- 
ment involved the administration of a mild electric 
shock and gave the subject an opportunity to with- 
draw from the study if he so desired. He then 
showed the men two rooms, one containing various 
sorts of apparatus which, he said, were instruments 
for giving and receiving electric shocks, and the 
second containing a motion picture projector and 
screen. In this latter room the experimenter de- 
scribed the experiment as dealing with problem- 
solving ability under stress. One person, and the 
experimenter indicated that the subject was to
take this role, would have to work on a problem
knowing the other person (the accomplice) would
judge the quality of his solution. The accomplice
would evaluate the subject's performance by giving
the subject from 1 to 10 electric shocks; the poorer
the solution the greater the number of shocks that
the subject was to receive. 

The accomplice then left to go into the room
containing the electrical apparatus, and the subject
was given his problem: to suggest how an auto-
motive service station could attract new customers.
Five minutes later the experimenter returned,
picked up the subject's written solution, and
strapped the shock electrode onto the subject’s arm.
He then left the room again, ostensibly to bring the
subject’s work to the other person for judging. A
minute later the accomplice in the adjoining room
administered either one shock (nonangered condi-
tion) or seven shocks (angered condition) to the
subject. After waiting 30 seconds, the experimenter 
returned to the subject, asked him how many shocks 
he had received, and then administered a brief 
questionnaire on which the subject rated his mood 
on four separate scales.2

While the subject was responding to this form, 
the experimenter recalled the accomplice. Then as 
soon as the subject had finished, the experimenter 
said he would show the two men a brief film in 
order to study the effects of a diversion upon 
problem-solving effectiveness. Half of the subjects 
(aggressive movie condition) saw the fight scene 
from the movie Champion. The experimenter in- 
troduced this 7-minute film clip by giving them the 
“justified aggression” synopsis. According to earlier
findings (Berkowitz & Rawlings, 1963), this context
seems to lower inhibitions against aggression. Fur- 
ther, in the aggressive movie-Kirk condition the ex- 
perimenter casually but pointedly remarked that the 
first name of the movie protagonist was the same as 
that of the other person, that is, the accomplice. 
This was done to make sure that there was a name- 
mediated connection between the experimenter’s con- 
federate and the witnessed violence when the ac- 
complice was said to be “Kirk Anderson.” The other 
half of the subjects were shown an equally long and 
exciting movie of a track race between the first two 
men to run the mile in less than 4 minutes. 

Upon conclusion of the 7-minute film clip, the 
experimenter again sent the accomplice from the 
room with instructions to write his solution to the 
sale-promotion problem. The subject was informed 
that he would be given the other person’s solution 
and then was to evaluate it by shocking the other 
person from 1 to 10 times. Five minutes later the 
experimenter brought the subject a written prob- 
lem solution saying this was the other person’s work 
but which was actually previously constructed to be 
standard for all conditions. He told the subject to 
shock the other person as many times as he thought 
appropriate. The experimenter then went to the con- 
trol room to record the number and duration of the 
shocks supposedly being given to the accomplice.3


2 The one shock given to the men in the non-
angered conditions could have introduced a ceiling 
effect limiting the number of shocks administered 
in these conditions by defining what was the ap- 
propriate number. Contrary to this possibility, how- 
ever, questionnaire findings indicate that the sub- 
jects receiving one shock were much less hostile 
toward the confederate than subjects getting seven 
shocks.

3 Earlier studies in our program employed total 
shock duration as one of the aggression measures. 
Since this score obviously has a high positive cor- 
relation with number of shocks, we here experi- 
mented with another measure, mean shock duration. 
Our results with this score proved quite disappoint- 
ing. There were negative correlations between shock 
number and the mean duration of each shock in six 
of the eight conditions, suggesting that a “law of 
least effort” may have been operating to some extent, 


After waiting 30 seconds, the experimenter returned
to the experimental room and gave the subject the
final questionnaire on which the subject indicated
how much he liked the accomplice. When this form 
was completed the experimenter explained the decep- 
tions that had been practiced upon the subject and 
asked him not to discuss the experiment with any- 
one else for the remainder of the semester. 


RESULTS


Effectiveness of the Experimental Manipu- 
lations 


Since the experiment depended upon the 
proper registering of the accomplice’s name, 
the final questionnaire asked each subject to 
write down “the other person's" name. All 
88 men were correct. 

There also were several checks of the suc- 
cess of the anger induction. First, each sub- 
ject was asked how many shocks he had 
received and, again, each person correctly 
recalled the number of shocks given to him. 
More directly relevant to the arousal of emo- 
tion, after receiving the shocks each subject 
also rated his mood on a brief four-item 
questionnaire. The only item yielding a sig-
nificant effect by analysis of variance was 
the measure of how “angry” or “placid” the 
subject felt; the men given seven shocks re- 
ported themselves as being reliably angrier 
than the men shocked only once. There were 
no other significant differences. Table 1 pre- 
sents the mean anger rating in each of the 
eight experimental conditions. 

We might also note at this time the sig- 
nificant main effects for anger-nonanger on 
the final questionnaire. In comparison to the 
men getting only one shock, those people 
receiving seven shocks expressed a signifi- 
cantly lower preference for the accomplice 
as a partner in any subsequent experiment, 
indicated a reliably weaker desire to know 
the accomplice better, and were significantly 
more opposed to him as a possible roommate. 
All in all, there can be little doubt that the 
seven shocks had made the subjects angry 
with the experimenter’s accomplice. The find-
ings obtained with these ratings and the
earlier mood reports also suggest that the
confederate's name had not influenced either 


and that this could have restricted the utility of the 
duration measure.
TABLE 1
MEAN RATING OF FELT ANGER

Accomplice's      Aggressive film            Track film
name            Angered    Nonangered    Angered    Nonangered
Kirk             1.36       11.27         7.27       10.55
Bob              6.00       12.09         7.27       11.27

Note.—The lower the score the greater the felt anger. Cells 
having a subscript in common are not significantly different 
(at the .05 level) by Duncan multiple range test. 


the subjects’ level of felt anger or their atti- 
tudes toward this person. 


Test of the Aggression-Evoking-Cue Hy- 
pothesis 


The primary measure of aggression in this 
experiment was the number of shocks admin- 
istered by each subject. As we had expected, 
the men displaying the greatest number of 
aggressive responses were those who had seen 
the prize-fight film after they were provoked 
and who then had an opportunity to attack 
their frustrater named “Kirk Anderson.” The 
accomplice’s name had apparently caused him 
to be associated with the violent scene so 
that he could then elicit strong overt hos- 
tility from the people who, being angered, 
were primed to act aggressively. These sub- 
jects gave a significantly greater number of 
shocks than the men in any of the other 
conditions. The mean number of shocks in
each condition is given in Table 2. 


Other Questionnaire Findings 

We have already summarized the major 
findings obtained with the final questionnaire, 
administered after the subjects had given the 


accomplice the electric shocks; in general, at 
the end of the experiment the subjects still 


TABLE 2
MEAN NUMBER OF SHOCKS GIVEN TO ACCOMPLICE

Accomplice's      Aggressive film            Track film
name            Angered    Nonangered    Angered    Nonangered
Kirk             6.09       1.73          4.18       1.54
Bob              4.55       1.45          4.00       1.64

Note.—Cells having a subscript in common are not signifi-
cantly different (at the .05 level) by Duncan multiple range
test.
disliked the accomplice more after having
received seven shocks from him than after 


getting only one shock. Aside from this, 
however, the pattern of condition differences 
obtained with the questionnaire data did not 
resemble the findings obtained with the shock 
measure. Many of the men could have be- 
come somewhat anxious or guilty after ad- 
ministering the electrical punishment. This
reaction might have then affected the ques- 
tionnaire responses—either decreasing or 
intensifying the verbal expressions of hostility 
(cf. Berkowitz, 1962, Ch. 8). 

But we can make an assumption here that
seems warranted in the light of other findings.
Those people experiencing a strong instigation
to aggression may display persistently strong
aggressive responses over time, responses that


TABLE 3
MEAN PRODUCT-MOMENT CORRELATION BETWEEN
SHOCK NUMBER AND SUBSEQUENT
QUESTIONNAIRE HOSTILITY

Accomplice's      Aggressive film            Track film
name            Angered    Nonangered    Angered    Nonangered
Kirk             .37 (4)*   -.09 (1)      -.18 (2)   -.15 (1)
Bob             -.16 (0)    -.27 (1)       .10 (3)   -.01 (1)

* The numbers in parentheses refer to the number of positive
correlations of the four computed in each condition. One of the
four positive correlations in the aggressive film-angered-Kirk
condition attained statistical significance while none of the four
remaining significant r's were positive.


are not quickly altered by anxiety-guilt re-
actions (cf. Berkowitz & Holmes, 1960, cited
in Berkowitz, 1962, p. 96). Thus, if the final
questionnaire ratings are, at least in part,
expressions of hostility, the condition in
which the strongest aggressive response tend-
encies had been activated should exhibit the
highest positive correlation between shock
number and questionnaire ratings. In order
to test this reasoning four product-moment
correlations were computed in each of the
eight conditions: between the number of
shocks given by each subject and his verbal
expression of hostility on each of the four
hostility items in the final questionnaire. A
mean correlation was then obtained for each
condition after first employing the r to z
transformation. The results are shown in
Table 3.
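
The averaging step can be illustrated with a minimal sketch of the r to z procedure: each correlation is converted to Fisher's z, the z values are averaged, and the mean is converted back to r. The four input correlations below are hypothetical placeholders, not values from this experiment, and the unweighted mean is an assumption, since the report does not say whether any weighting was applied.

```python
import math


def mean_correlation(rs):
    """Average correlations via Fisher's r-to-z transformation.

    Each r is converted to z = atanh(r), the z values are averaged
    (unweighted, by assumption), and the mean z is converted back
    with r = tanh(z).
    """
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))


# Hypothetical correlations between shock number and four hostility items.
example_rs = [0.10, 0.30, 0.50, 0.55]
print(round(mean_correlation(example_rs), 3))
```

Averaging in the z metric rather than averaging the raw coefficients reduces the distortion caused by the skewed sampling distribution of r.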
While none of the condition differences are
statistically reliable, the general pattern is
consistent with the shock data and our theo-
retical expectations. First, combining the four 
angered and the four nonangered groups, we 
find that 62.5% of the 16 correlations in the 
strongly provoked groups were positive but 
only 25% of the relationships in the non- 
angered conditions were in this direction 
(χ² = 4.58, p = .05, if we treat the correla-
tions within a group as independent events). 
Thus, strong anger arousal tended to produce 
relatively persistent hostile tendencies; the 
people exhibiting comparatively strong open 
aggression on the first occasion generally ex- 
pressed a high level of aggression the next 
time measurements were obtained shortly 
afterwards. 
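
The comparison of proportions can be checked with a short sketch. It rebuilds the 2 × 2 table from the stated percentages (62.5% and 25% of 16 correlations each) and computes a Pearson chi-square. The absence of a continuity correction is our assumption; the report does not say which form of the test was used.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2 x 2 table [[a, b], [c, d]], uncorrected."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Counts implied by the percentages in the text: 16 correlations per group.
positive_angered = round(0.625 * 16)      # positive correlations, angered
positive_nonangered = round(0.25 * 16)    # positive correlations, nonangered

chi2 = chi_square_2x2(
    positive_angered, 16 - positive_angered,
    positive_nonangered, 16 - positive_nonangered,
)
print(round(chi2, 2))
```

Under these assumptions the statistic comes out at about 4.57, close to the reported 4.58. As the text cautions, the result rests on treating the correlations within a group as independent events.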

Turning now to the specific theoretical 
expectations, Table 3 also shows that the 
condition having the highest mean positive 
correlation was the one predicted to have the 
strongest aggressive responses: the angered- 
aggressive film-Kirk group. The strong acti- 
vation of aggression in this condition result- 
ing from the combination of provocation and 
aggression-eliciting cues led to longer lasting 
aggressive response tendencies as well as the 
high volume of electrical attacks upon the 
accomplice. 

These effects of the name attributed to the
accomplice raised a further question. Did 
the name “Kirk” serve as an aggression- 
evoking cue after the subjects had seen Kirk 
Douglas being beaten in part because of prior 
attitudes? Disliked objects may have the cue 
properties enabling them to elicit aggressive 
responses from people who are ready to act 
aggressively (Berkowitz, 1962; Berkowitz & 
Green, 1962). It is conceivable, then, that 
the present college students had some nega- 
tive attitudes toward the name “Kirk” and/or 
the actor Kirk Douglas which generalized 
to the accomplice, “Kirk Anderson." These 
attitudes could have facilitated the expression 
of open hostility toward Kirk Anderson. An 
additional investigation was conducted as an 
examination of this possibility. A sample of 
44 male university students comparable to 
those who participated in the experiment was 
given a list of 14 masculine and feminine 
first names and was asked to indicate how 
much they liked or disliked each name on a
7-point scale. The results demonstrated that 
there were no particularly strong feelings con- 
nected with the name Kirk. Although the 
subjects tended to rate the name on the nega- 
tive side of the scale (mean rating = 4.50), 
this mean rating was not significantly different 
from the neutral point (p = .22, one-tailed 
test). 

These subjects, however, did tend to asso- 
ciate the name Kirk with Kirk Douglas. 
When asked on a subsequent form to write 
down what family names came to mind in 
response to each of the 17 first names, 40 of 
the 44 respondents listed the patronym 
“Douglas” after the first name Kirk. But 
while Kirk may be connected with a particu- 
lar person, this individual is not necessarily 
disliked. This is indicated by the findings of 
a third questionnaire on which the respondents 
rated their attitudes toward each of 14 public 
figures. Kirk Douglas obtained a mean rating 
of 3.75 on the 7-point scales used in this 
instrument, a mean score which again is not 
significantly different from the neutral point. 


DISCUSSION 


All in all, the above findings lend compara- 
tively clear support to the theoretical analysis 
upon which the present study was based. Ob- 
served aggression, we have shown, does not 
necessarily lead to open aggression against 
anyone. Particular targets are most likely to 
be attacked, and these are objects having 
appropriate, aggression-eliciting cue proper-
ties. In the present case the target’s cue value
is derived from a label-mediated association 
with the witnessed aggressive scene—or more 
specifically, with the victim of the observed 
hostility. Having this association, the target
evokes aggressive responses from the audience 
members who are primed to act aggressively 
and whose restraints against aggression are 
fairly weak. 

But assuming the essential validity of this 
analysis, we can also raise a number of im- 
portant unanswered questions. For one thing, 
did the accomplice draw the greatest number 
of aggressive responses from the people in the 
angered-aggressive film-Kirk group because 
of his connection with aggression generally, 
or because he was most closely associated 
with the victim of the observed violence? It
is conceivable that an object's aggressive cue 
properties are derived fundamentally from 
the object's connection with aggressive be- 
havior, whether these acts are given or re- 
ceived. Thus, if college boxers tend to draw 
stronger hostility than do speech majors, as 
seems to be the case (Berkowitz, 1965), this 
may be due to the former role's closer associ- 
ation with fighting in general. The same point 
can perhaps be made with regard to the 
presumed aggression-eliciting properties of 
disliked people. Here again the disliked object 
may somehow be associated with aggression. 

A second question has to do with the fre- 
quently exciting nature of observed aggres- 
sion. In addition to their specific content, 
violent scenes typically are fairly exciting. 
This excitement means, of course, that there 
is a relatively strong arousal state within the 
observer, and this high arousal level might 
well contribute to the strength of the aggres- 
sive responses elicited in the situation. There 
is a suggestion to this effect in the data 
summarized in Table 2. Looking at the mean 
number of shocks delivered to the accomplice 
“Bob,” we can see that the angered subjects 
seeing the prize fight did not express reliably 
stronger hostility than the men witnessing the 
exciting track race. Other experiments have 
obtained a much more substantial difference 
when the same aggressive scene was com- 
pared with a less arousing neutral film (e.g., 
Berkowitz & Rawlings, 1963). The exciting 
nature of the prize fight might have con- 
tributed to the condition differences obtained 
in the earlier research. Whether this is true 
or not, however, film-engendered differences 
in degree of arousal cannot account for the 
present findings. The aggressive scene was 
probably not more exciting when the accom-
plice’s name was Kirk than when he was 
called Bob. For that matter, as is shown in 
Table 1, the subjects did not feel greater 
anger toward Kirk than toward Bob.* 


* An experiment by the present writers, conducted 
after this article went to press, indicates that associ- 
ations with nonaggressive, exciting scenes do not 
increase the available target's aggressive cue prop- 
erties. Subjects were made to be angry with a con- 
federate, whose name in some cases was said to 
be either “Landy” or “Bannister,” and then either 
saw the track race film or sat still for an equivalent 
time. For the people shown the track film the con-
federate's name connected him either with the win-
ner or loser in the observed race. The confederate
did not receive more shocks after the track film than
after the no-film treatment, regardless of whether
or not his name connected him with the track film.
The important association evidently
is with an aggressive scene.
CONNOTATIONS OF RACIAL CONCEPTS AND COLOR NAMES 1


JOHN E. WILLIAMS 
Wake Forest College 


Language custom designates racial groups by the color names white, black, red, 
yellow, and brown, a practice which may condition the connotative meanings 
of color names to concepts representing racial groups. This study compared the 
connotative meanings of triads of color-linked concepts consisting of: color
names (e.g., black), color-person concepts (e.g., black person), and ethnic con-
cepts (e.g., Negro). For Caucasian Ss from both South and Midwest, color-
linked concepts were substantially more similar in meaning than were non- 
color-linked concepts. The evaluative (good-bad) connotations of ethnic con- 
cepts were predictable from their associated color names. Different results were 
obtained for Negro Ss. The findings were interpreted as indicating that the 
color-coding of racial groups is related to the perception of these groups and 


the favorability of attitudes toward them. 


Language custom designates racial groups 
according to a "color code" in which Cau- 
casians are called white, Negroes are referred 
to as black, and Orientals, American Indians, 
and Southwest Asians are designated, respec- 
tively, as yellow, red, and brown. On reflec- 
tion, it is obvious that this color code has 
little descriptive accuracy with regard to skin 
color; Caucasians are not literally “white,” 
nor is the modal American Negro "black," 
and yet, applications of this color nomen- 
clature are encountered daily, for example, in 
popular press accounts of racial problems or 
incidents. Although admittedly convenient,
and seemingly innocuous, the practice of color 
coding may have hidden and, perhaps, un- 
desirable effects since color names such as 
white and black are regularly used in other 
contexts as general cultural symbols to con-
vey different connotative meanings such as
goodness and badness (Williams, 1964). The 
present study was concerned with the ques- 
tion of whether the color-coding practice is 
related to the way in which different racial 
groups are perceived. 

In an earlier investigation, Williams 
(1964) studied the connotative meanings of 
color names presented in a nonracial context. 


1 This research was supported, in part, by a grant 
from the Wake Forest College Graduate Council. 
The author is grateful to Lafayette Parker and Jef-
ferson Humphrey of Winston-Salem State College,
to Bertram Spiller of Washburn University, and to
D Hicks, formerly of Washburn University, for
their assistance in data collection.


It was demonstrated that there are striking 
differences in the connotative meanings of 
color names and that these meanings are rela- 
tively stable across both regional and racial 
lines. For example, the connotative meaning 
of the color name white was found to be 
“good,” “active,” and “weak,” while the color 
name black was “bad,” “passive,” and 
“strong.” Since it is known that the con- 
notative meanings of words can be classically 
conditioned to other words (Staats & Staats, 
1957, 1958), it seemed likely that the con- 
sistent association of a color name, such as 
black, with a racial concept, such as Negro, 
would tend to condition the connotations of 
the former to the latter. In this way, the 
practice of color coding might operate as a 
background factor in the development and/or 
maintenance of attitudes toward racial groups. 

One way to observe the effects of using 
color names to designate groups of persons 
would be to study the meaning similarity 
between color names, as such, and color 
names used as adjectives to describe people. 
For example, is there a similarity in the con- 
notative meanings of the concepts black and 
black person? One might also study color- 
person concepts in relation to color-code re- 
lated ethnic concepts (e.g., black person and 
Negro) and observe the degree of connotative 
meaning similarity. In a third type of com- 
parison, one might study the meaning simi- 
larity of ethnic names (e.g., Negro, Cau- 
casian) to the color names with which the 
color code associates them. 
The general hypothesis for this study was
that similarities in connotative meaning will 
be found to be greater among concepts which 
are linked by the color code than among 
concepts not so related. Although no specific
predictions were made, it was anticipated 
that Caucasian subjects and Negro subjects 
might differ in their ratings of the racially 
significant concepts of the present study, in 
spite of their generally similar performance 
in rating color names in the earlier study 
(Williams, 1964). 


METHOD 


This study was an extension of Williams’ 
(1964) earlier study of the connotative mean- 
ings of color names, with the same subject 
populations and data gathering procedures 
being employed. The reader is referred to the 
earlier study for a more detailed description 
of materials and procedures. 


Subjects 


Subjects were introductory psychology students 
from three institutions: Caucasian 2 students from
Wake Forest College, a liberal arts college in North 
Carolina; Caucasian students from Washburn Uni- 
versity, a municipal university in Kansas; and Negro 
students from Winston-Salem State College, a liberal 
arts college in North Carolina. The numbers of 
subjects from each institution rating the color- 
person concepts were, respectively, 86, 88, and 106. 
The numbers rating the ethnic-national concepts 
were 110, 70, and 60. In the two Caucasian groups, 
no subject rated both groups of concepts. In the 
Negro group, approximately one quarter of the 
subjects rating the ethnic-national concepts had 
rated the color-person concepts approximately 1 
month earlier. All research groups were composed
of equal numbers of men and women. 


Semantic Differential 


Other than the concepts rated, the rating pro- 
cedure used was identical to that employed by 
Williams (1964) which was based on the work of 
Osgood, Suci, and Tannenbaum (1957). 

Color-person concepts. The 10 color-person con- 
cepts were formed by pairing the word person with 
each of the 10 color names studied by Williams 
(1964), that is, black person, white person, brown 
person, yellow person, red person, blue person, green 
person, purple person, orange person, and gray 
person. The concepts were presented to the subject 
in random order and rated on 12 scales, 6 of which 
had been chosen to reflect the evaluation (E) factor, 


2 The term Caucasian is used throughout this
paper in its popular meaning of “white person” 
rather than in any technical ethnological sense. 
Ethnic-national concepts. This group of 14 con-
cepts included 5 concepts selected because of their
relevance to color and color-person concepts:
Negro, Caucasian, Indian (Asiatic), Oriental, Indian
(American); 4 other ethnic-national concepts:
American, African, Chinese, Japanese; and 5 general
reference concepts: citizen, foreigner, friend, enemy,
and person. Using the same 12-scale rating sheet,
the subjects rated person first; then the other 4
reference concepts in random order; and, finally, the
9 ethnic-national concepts in random order.


factor.

Procedure




The procedure was administered to groups of sub- 
jects by an experimenter of the same race using 
conventional semantic differential rating instructions. 
The concepts to be rated were presented to the 
subject in a mimeographed booklet with a single
concept name heading each page and with the 12
rating scales presented below. Subjects recorded
their sex but no other identifying information was
requested.


RESULTS 


The basic data for study consisted of the 
E, P, and A scores for the five race-related 
color-person concepts: black person, white
person, brown person, yellow person, and red
person; and for five corresponding ethnic con-
cepts: Negro, Caucasian, Indian (Asiatic),
Oriental, and Indian (American).

Each rating sheet was scored by assigning
the digits 1-7 to the 7 positions on each
rating scale and summing the ratings on the
appropriate scales for the E, P, and A factors.
Low scores represented the “good” end of the
E dimension, the “weak” end of the P di-
mension, and the “passive” end of the A
dimension. Separate analyses by sex were
not made after inspection of the data indi-
cated that men and women subjects were re-
sponding to the task in essentially the same
manner. Table 1 displays the three mean
factor scores for each concept, separately for
each of the three groups of subjects. Included
in Table 1 are the scores for the five race-
related color names from the Williams (1964)
study.
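
The scoring rule can be sketched as follows. The scale identifiers (e1 ... a3) are hypothetical placeholders, and the split of the six non-evaluative scales into three potency and three activity scales is an assumption made for illustration. Following the note to Table 1, each summed factor score is divided by the number of scales contributing to it.

```python
# Hypothetical scale identifiers; the 3/3 split of P and A scales is assumed.
FACTOR_SCALES = {
    "E": ["e1", "e2", "e3", "e4", "e5", "e6"],
    "P": ["p1", "p2", "p3"],
    "A": ["a1", "a2", "a3"],
}


def factor_scores(ratings, factor_scales=FACTOR_SCALES):
    """Sum the 1-7 ratings on each factor's scales, then divide by scale count."""
    scores = {}
    for factor, scales in factor_scales.items():
        values = [ratings[name] for name in scales]
        if any(v not in range(1, 8) for v in values):
            raise ValueError("each rating must be an integer from 1 to 7")
        scores[factor] = sum(values) / len(values)
    return scores


# One made-up rating sheet for a single concept.
example_sheet = {
    "e1": 2, "e2": 1, "e3": 2, "e4": 3, "e5": 2, "e6": 2,
    "p1": 4, "p2": 5, "p3": 3,
    "a1": 5, "a2": 6, "a3": 4,
}
example_scores = factor_scores(example_sheet)
print(example_scores)
```

On this convention a low E score marks the "good" end of the evaluation dimension, so the made-up sheet above would describe a concept rated as fairly good, moderately strong, and fairly active.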


Intercorrelations of E, P, and A Scores 


In the earlier study of color names, it was
found that the E, P, and A scores were char-
acterized by a high degree of statistical inde-
pendence, thus supporting the notion that the
TABLE 1
MEAN SEMANTIC DIFFERENTIAL SCORES FOR CONCEPTS RATED BY CAUCASIANS FROM THE SOUTH,
CAUCASIANS FROM THE MIDWEST, AND NEGROES FROM THE SOUTH

                         Evaluation                Potency                 Activity
                      Caucasian               Caucasian               Caucasian
Concept             South  Midwest  Negro   South  Midwest  Negro   South  Midwest  Negro

White               1.79   1.85     2.05    3.60   3.49     3.52    4.75   4.90     5.10
Black               5.09   4.98     4.11    5.98   6.29     5.70    3.31   3.69     3.63
Brown               4.45   4.25     3.82    4.95   5.20     4.92    2.74   2.95     3.51
Yellow              2.82   2.64     2.52    3.21   3.10     3.24    4.99   4.77     5.00
Red                 3.18   3.18     3.08    5.58   5.96     5.19    6.23   6.32     5.77
White person        2.63   2.45     3.75    4.24   4.26     3.53    5.24   5.06     4.48
Black person        4.52   4.16     3.89    5.79   5.22     5.23    3.53   3.73     4.45
Brown person        4.02   4.14     3.22    5.01   4.86     4.72    3.61   2:0      4.58
Yellow person       3.61   3.81     3.47    2.69   2.69     3.54    4.31   4.27     4.49
Red person          3.69   3.72     3.83    5.15   5.21     4.69    5.51   5.52     4.96
Caucasian           2.69   2.98     3.89    4.86   4.66     4.14    5.17   5.06     4.42
Negro               4.08   3.92     2.89    5.10   4.72     4.93    3.52   3.99     4.12
Indian (Asiatic)    3.65   3.84     3.68    3.83   3.79     4.26    3.85   3.84     4.27
Oriental            3.20   3.53     3.63    3.01   3.10     3.85    4.35   4.24     4.37
Indian (American)   3.24   3.46     3.53    4.91   4.64     4.47    4.66   4.41     4.72
Person              2.56   2.88     2.69    4.47   4.30     4.40    5.10   4.10     4.93
Friend              1.91   1.85     2.40    4.94   4.77     4.70    5.55   5.15     5.28
Enemy               5.68   5.67     5.41    4.74   4.53     4.07    4.73   4.22     4.18
Citizen             2.74   2.69     2.81    4.71   4.60     4.64    5.00   5.00     4.92
Foreigner           3.18   3.18     3.41    4.09   3.98     4.00    4.45   4.45     4.22

Note.— Scores shown are mean factor scores divided by number of scales. Color-name data are from Williams (1964). 


different scores reflected different aspects of 
connotative meaning. In order to assess the 
degree of independence of the three scores in 
the present study, correlation coefficients were 
computed for each pair of the three scores, 
for each of the five race-related color-person 
concepts and for each of the five correspond-
ing ethnic concepts, separately for each of
the three groups of subjects. For the color-
person concepts, the median coefficients were
as follows: E versus P, r = +.03; E versus
A, r = -.40; P versus A, r = +.29. For the
ethnic concepts the median coefficients were:
E versus P, r = -.28; E versus A, r = -.46;
P versus A, r = +.40. While these correla-
tions indicated that the E, P, and A scores
were not statistically independent, the amount
of common variance was not high (0-20%)
and it was judged useful to analyze and
report the scores separately.


Comparison of Caucasian Groups 


The E, P, and A scores of the two Cau-
casian groups were analyzed to determine
whether the connotative meanings of the con-
cepts under study were rated differently by
Southern and Midwestern college students.

Color-person concepts. For each of the 
three scores (E, P, and A), separately, a 
Lindquist (1953) Type I analysis of vari- 
ance was performed with the five color-person 
concepts comprising the within-subjects di- 
mension and the two groups of Caucasian 
subjects, the between-subjects dimension. In 
each analysis, the between-groups and inter- 
action effects were not significant while the 
between-concepts effect was highly significant 
(p < .001).

Ethnic concepts. The analyses of the E, P, 
and A scores for the five ethnic concepts 
paralleled the color-person analyses just de- 
scribed with similar findings: nonsignificant 
between-groups and interaction effects, and 
highly significant (p < .001) between-concepts 
effects. 

On the basis of the foregoing analyses, it 
was judged appropriate to pool the data of 
the Southern and Midwestern Caucasian
groups in subsequent analyses. 


Comparison of Caucasian and Negro Groups 


This analysis was to determine whether 
the Negro subjects were responding to the 
race-related color-person and ethnic concepts 
in a manner similar to that of the Caucasian 
subjects. For each of the two groups of five 
concepts, three Lindquist (1953) Type I 
analyses of variance were run, one each for 
the scores E, P, and A. In these analyses the 
five concepts represented the within-subjects 
dimension and the Caucasian-Negro classifi-
cation was the between-subjects dimension.
In each of the six analyses, the interaction 
effect was highly significant (p < .001) indi- 
cating that the two groups of subjects were 
responding quite differently to the concepts 
and, hence, that the subsequent analyses 
should be made separately for Caucasian and 
Negro subjects. 


Caucasian Subjects: Comparison of Related 
Color, Color-Person, and Ethnic Concepts 


In the top portion of Figure 1 are dis- 
played the scores for related triads of color, 
color-person, and ethnic concepts along the 
E dimension. The vertical lines connect color- 
code related triads of concepts, that is, white- 
white person-Caucasian; black-black person- 
Negro; brown-brown person-Indian (Asiatic) ; 
yellow-yellow person-Oriental; red-red person- 
Indian (American). The general similarity of 
rank orders along the E dimension is quite 
apparent with the concepts white, white per-
son, and Caucasian rated most “good,” the
concepts black, black person, and Negro rated 
most “bad,” and the other three triads occupy- 
ing intermediate positions. Thus, one can pre- 
dict the rank position of color-person and 
ethnic concepts on the E dimension quite 
accurately on the basis of the rank position 
of the color concept with which they are 
conventionally associated. 

The middle and lower thirds of Figure 1 
indicate a substantial degree of rank-order 
similarity for related triads along the P and 
A dimensions but not as high a degree of con- 
sistency as that found for the E dimension. 
On the P dimension, the color-person con-
cepts maintain the same rank order as the
color concepts, but a shift out of rank order
is seen for the ethnic concepts Caucasian and
Indian (Asiatic). On the A dimension, there
is some shifting of rank order both from
color to color-person concepts, and from
color-person concepts to ethnic concepts.

FIG. 1. Semantic differential scores of Caucasian
subjects for color-linked triads of color, color-person,
and ethnic concepts. (Vertical lines connect triads,
as follows: W = white, white person, Caucasian;
Y = yellow, yellow person, Oriental; R = red, red
person, Indian (American); BR = brown, brown
person, Indian (Asiatic); BL = black, black person,
Negro.)

D scores. The similarities in meaning seen
in Figure 1 may be conveniently summarized
by the use of the D index developed by
Osgood and his associates (1957, pp. 89 ff.).
Based on the generalized distance formula
of solid geometry, D provides an index of
the distance between pairs of concepts in
three-dimensional semantic space. Osgood et
al. (p. 93) note that the use of D to indicate
absolute semantic distances requires the as-
sumption that the three variables employed
are statistically independent, a condition not
met in the current instance (see above). It
was judged, however, that in the present
situation where interest was in the relative
magnitude of different D scores, the partial
violation of this assumption was not critical.

Applying the formula of Osgood et al.
(1957, p. 91), D scores were computed
between each color and each color-person
concept, each color and each ethnic concept, 
and each color-person and each ethnic con- 
cept. To illustrate, the D score for any two 
concepts was obtained by: computing the 
difference between the two mean E scale 
scores, the two mean P scale scores, and the 
two mean A scale scores; squaring each of 
these differences; summing the squares and 
taking the square root of the sum. From this, 
it can be seen that low D scores indicate high 
similarity in overall meaning while high 
scores reflect low similarity. It should also 
be noted that the D procedure weights all 
three difference scores equally, ignoring the 
greater pervasiveness of the E factor (in a 
factor analysis sense) and also its greater 
relationship to positive and negative attitudes 
as traditionally measured (Osgood et al., 
1957, p. 193). 
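
The computation just described can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure as run: the function and variable names are hypothetical, and the input values are the Negro-subject means for five concepts as read from Table 1. Under those assumptions the sketch yields distances of .54, 1.29, 1.54, and 1.81 between Caucasian and foreigner, citizen, enemy, and friend, which agree with the figures reported for Negro subjects in the section on reference concepts.

```python
import math

# Mean E, P, and A scores of the Negro subjects, read from Table 1.
NEGRO_SUBJECT_MEANS = {
    "Caucasian": (3.89, 4.14, 4.42),
    "Citizen": (2.81, 4.64, 4.92),
    "Foreigner": (3.41, 4.00, 4.22),
    "Friend": (2.40, 4.70, 5.28),
    "Enemy": (5.41, 4.07, 4.18),
}


def d_score(a, b):
    """Distance between two concepts in E-P-A space (all three weighted equally)."""
    # Difference on each dimension, squared, summed, then square-rooted.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distances_from(concept, others, means=NEGRO_SUBJECT_MEANS):
    """D scores, rounded to two places, between one concept and each of the others."""
    return {other: round(d_score(means[concept], means[other]), 2) for other in others}


if __name__ == "__main__":
    result = distances_from("Caucasian", ["Citizen", "Foreigner", "Friend", "Enemy"])
    for concept, d in result.items():
        print(f"Caucasian vs {concept}: D = {d:.2f}")
```

Because every dimension enters with equal weight, a large difference on A counts exactly as much as the same difference on E, which is the limitation noted in the paragraph above.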

Table 2 lists the 75 D scores obtained 
when each concept in one class was compared 
with each concept in the other two classes. 
These D scores provide a convenient place 
for a formal testing of the hypothesis that 
similarities in connotative meaning will be 
found to be greater among concepts linked 
by the color code than among concepts not 
so related. To test this hypothesis, a Mann- 
Whitney U test (Peatman, 1963) was com- 
puted for each third of Table 2 separately. 
For the comparison of color and color-person
concepts at the upper left, the D scores for
the five related concepts (along the diagonal)
were found to be significantly (p < .001)
smaller than the 20 D scores for unrelated
concepts. For the comparison of color and
ethnic concepts at the upper right, the related 
D scores were again significantly (p < .025) 
smaller, as were the related D scores for the 
comparison of color-person and ethnic con-
cepts at the lower right (p < .001). The
consistency of the predicted effect may also 
be observed by comparing any one of the 
D scores for related concepts with the mean 
of the other four D scores in its particular 
row or column. In every instance, the D for
related concepts is smaller than the mean 
of the other four unrelated D scores. 
These findings were taken to indicate that 
the general hypothesis was confirmed for 
Caucasian subjects. 

The relative similarity of color-linked and 
non-color-linked concepts for Caucasian sub- 
jects is summarized in the upper portion of 
Figure 2. In the triangle on the left, the 
distance between any two vertices represents 
the mean of the five D scores for color-related 
pairs of concepts. For example, the distance 
between the points designated C and CP was 
obtained by averaging the five D scores for 
related color and color-person concepts (i.e.,
white versus white person, black versus black
person, etc.). The distance between the verti-
ces labeled CP and E represents the average
of the five D scores for related color-person
and ethnic concepts (i.e., white person versus
Caucasian, black person versus Negro, etc.).
The distance between the C and E vertices
represents the average of the five D scores for
white versus Caucasian, black versus Negro,
etc. Thus, the distances shown in the small


TABLE 2
SEMANTIC DISTANCES (D SCORES) FOR CAUCASIAN SUBJECTS BETWEEN THREE GROUPS OF CONCEPTS:
COLOR VERSUS COLOR PERSON, COLOR VERSUS ETHNIC, AND COLOR PERSON VERSUS ETHNIC

Concept          White    Black    Brown    Yellow   Red      Caucasian  Negro   Indian     Oriental  Indian
                 person   person   person   person   person                      (Asiatic)            (American)

White            11       36       2.88     2.11     2.62     1.65       2.82    2.43       1.64      2.00
Black            in       $39      asi      3.74     2.69     3.08       1.56    2.46       3.58      2.41
Brown            $08      0.93     0.94     2.86     2.80     2.82       0.98    1.74       2.71      2.29
Yellow           2.05     2.72     2.44     1.25     2.60     1.62       2.48    1.57       0.85      1.76
Red              29s      3.02     2.84     3.66     1.06     1.52       2.80    3.13       3.32      1.94
White person                                                  0.59       2.17    1.81       1.66      1.13
Black person                                                  2.47       0.85    1.94       2.86      1.75
Brown person                                                  1.93       0.04    1.12       2.43      1.16
Yellow person                                                 2.42       2.32    1.20       0.52      2.16
Red person                                                    tos        1.84    2.15       2.47      1.09
lated concepts were generally smaller but in
neither case significantly so. Thus, the data of 
the Negro subjects provided only slight sup- 
port for the hypothesis under investigation. 

The relative similarity of color-linked and
non-color-linked concepts for Negro subjects 
is summarized in the lower portion of Figure 
2. As would be expected from the statistical 
tests discussed above, the related concepts 
triangle is somewhat smaller than the un- 
related concepts triangle. It seems clear, how- 
ever, that classification of concepts as “color- 
related” and “color-unrelated” has much less 
significance for the Negro subjects than for 
the Caucasian subjects. The relative size of 
the two right-hand triangles illustrates, again, 
the lesser degree of concept differentiation 
by the Negro subjects. 

Certain of the D scores in Table 3 are of 
particular interest. These Negro subjects 
rated the concept Caucasian as most similar 
to the concept white person and most dif- 
ferent from the concepts black person and 
brown person; while Negro was most similar 
to brown person and least similar to white 
person. It is noteworthy that the largest D 
score among the ethnic versus color-person 
comparisons was that between Negro and 
white person. In the color versus ethnic com- 
parison, it is of interest that these subjects 
rated Caucasian as least like white and Negro 
as least like black. 


Comparisons with Reference Concepts 


It will be recalled that certain reference 
concepts had been rated along with the ethnic 
concepts. Included were the concept person 
and two sets of logically contrasted concepts, 
namely, friend and enemy and citizen and 
foreigner. It was considered of interest to 
study the connotative similarity of the con- 
cepts Caucasian and Negro to these reference 
concepts. (Mean scores for the reference 
concepts are given in Table 1.) 

D scores computed for the data of the Cau- 
casian subjects revealed that the concept 
Caucasian had a higher similarity to citizen 
(.20) than to foreigner (1.01) and a higher 
similarity to friend (1.06) than to enemy 
(2.94). On the other hand, the concept Negro 
was seen as more similar to foreigner (1.48) 
than to citizen (1.84) and more similar to 
enemy (1.88) than to friend (2.87). For
these Caucasian subjects, the similarity of the 
concept person to the five ethnic concepts 
was, in decreasing order, Caucasian (.45), 
Indian (American) (.84), Indian (Asiatic) 
(1.60), Oriental (1.62), and Negro (1.88). 
The similarity of the concept person to the 
five color-person concepts was, in decreasing 
order, white person (.31), red person (1.40), 
brown person (1.94), black person (1.95), 
and yellow person (2.09). The mean of the 
D scores between person and the color-person 
concepts was 1.54. This figure may be com-
pared with the mean D score of 1.03 between
color and color-person concepts, noted above. 

Turning to the data of the Negro subjects,
the concept Negro was found to be more
similar to citizen (.36) than to foreigner
(1.18) and more similar to friend (.78) than
to enemy (2.72). On the other hand, Negro
subjects saw the concept Caucasian as more 
similar to foreigner (.54) than to citizen 
(1.29) and more similar to enemy (1.54) 
than to friend (1.81). For the Negro sub- 
jects, the similarity of the concept person to 
the five ethnic concepts was, in decreasing 
order, Negro (.62), Indian (American) (.63), 
Oriental (.76), Indian (Asiatic) (.83), and 
Caucasian (1.00). The similarity of the con- 
cept person to the five color-person concepts 
was, in decreasing order, brown person (.50), 
yellow person (.57), red person (1.03), white 
person (1.15), and black person (1.33). 


DISCUSSION 
The data of the Caucasian subjects pro-
vided evidence in strong support of the
hypothesis under investigation; namely, that
racial concepts have connotative meanings 
similar to the color names with which they 
are linked by the color-coding custom. Al- 
though similarities in meaning were found for 
the A and P dimensions, the most consistent 
similarity was seen on the E dimension. This 
is perhaps the most important finding of the 
study since it is known that score variation 
along the E dimension covaries closely with 
score variation on conventional attitude tests 
(Osgood et al., 1957, p. 193). Using this 
interpretation, we can note that the attitudes 
of Caucasian subjects would appear to be 
most favorable toward Caucasians, somewhat 
less favorable toward American Indians and
Orientals, and least favorable toward Asiatic 
Indians and Negroes. This is to be compared 
with their favorable evaluative rating of 
white and progressively less favorable ratings 
of yellow, red, brown, and black. While the 
direction of cause and effect cannot be demon-
strated here, these data are consistent with
the notion that the evaluative connotations 
of color names applied to racial groups are 
one determinant of the favorability of atti- 
tudes toward the racial groups. A hypothesis 
under current investigation is that evaluative 
color connotations—particularly white as 
good and black as bad—are learned early in 
childhood and influence the subsequent devel- 
opment of racial attitudes. 

It was interesting to observe how the mean- 
ing of the word person was modified by the 
use of color adjectives. In the data of the
Caucasian subjects, it was seen that the 
meanings of the color-person concepts were 
generally more similar to their associated 
color names than to the concept person. Ap-
parently, the color adjective takes precedence
over the noun and the connotative meaning
communicated by the concept black person
is BLACK-person rather than black-PERSON.

The high consistency in the data of the 
two Caucasian groups is worthy of note. As 
in the earlier study of color names (Williams, 
1964), Caucasian students in North Carolina 
and in Kansas were found to rate the con- 
notative meanings of racial concepts in a 
highly similar fashion indicating that the 
hypothesized effects of color coding may have 
some geographical generality. It would be
interesting to know whether the same effects 
would be found among Caucasian subjects 
from other geographical regions with differing 
histories and customs in racial matters. 

While it was clear that the Caucasian sub- 
jects saw each triad of color-code related 
concepts as belonging to the same “meaning 
family,” this was not the case for the Negro 
subjects. While generally agreeing with Cau- 
casians on the meanings of color names pre- 
sented in a nonracial context, Negro subjects 
responded to the racial concepts in a quite 
different fashion, particularly along the evalu-
ative (attitude) dimension. Examples of the
different ratings of the Negro subjects were
their rating of Negro as good and Caucasian
as relatively bad, and their rating of brown 
person as more good than white person and 
black person. It is not surprising, of course, 
to find that the responses of Negro subjects 
to racial concepts differ from Caucasian re- 
sponses since the groups obviously have had 
differential experiences with the concepts. In 
addition, there appears to be developing re-
sistance among Negroes to the color-coding
practice with its connotative significance. 
This is seen in extreme form in the efforts 
of the Black Muslims and others to arbitrar-
ily reverse the conventional symbolism by
associating black with goodness and white 
with badness. It would seem doubtful that 
such deliberate efforts at reversal can gen- 
erally succeed in a culture where the symbol- 
ism of white as good and black as bad is so 
thoroughly entrenched in literature, religion, 
the mass media, etc. 

A simpler way of dealing with the unfor- 
tunate effects of color coding would be an 
attempt to bypass the problem by a deliberate 
effort to reshape language habits so that
groups of persons are not designated by color 
names. For example, if the popular press would
forego the convenience of discussing racial 
problems in terms of white persons and black 
persons, one important avenue of reinforce- 
ment would be removed. As noted elsewhere 
(Williams, 1964), such a proposal would 
probably encounter resistance from many 
Caucasians who, while perhaps willing to part 
with the designation of Negroes as black or 
brown, would be reluctant to give up the 
designation of their own group as white, with 
its positive evaluative connotations. And, of
course, the abolition of the custom of color 
coding would not fully solve the color prob- 
lem since there are average differences in skin 
color between Caucasian and Negro persons 
and lighter skin would no doubt continue to 
be positively valued. However, one might say 
that white Americans and black Americans 
will continue to find it very difficult to solve 
their problems, while Americans (with differ- 
ing shades of skin color) would have a better 
chance of doing so. Indications of the prob- 
lems remaining to be solved were seen in the 
comparisons of ethnic concepts and reference 
concepts where it was shown that each racial 
group saw its own racial designation as most
similar to the concepts person, friend, and
citizen and the designation of the other racial
group as most similar to enemy and foreigner.
EFFECTS OF TEAM ARRANGEMENT ON
TEAM PERFORMANCE:
A LEARNING-THEORETIC ANALYSIS


KARL EGERMAN


American Institutes for Research, Pittsburgh, Pennsylvania 


3 groups of 6 2-man teams, differing only in arrangement, underwent 2 major 
phases of training: preteam, where each individual developed a proficiency in 
making a timing response; and team training, where each S used his timing 
skill as a team member. Individual preteam proficiencies and the team arrange- 
ment were the only 2 variables used to predict (a) initial team performance, (b) 
the schedule of reinforcement for each S, and (c) the manner in which team 
performance would change from the initial to the final periods of training. 
This investigation points out the feasibility of applying learning-theoretic
principles to a study of group behavior.


As pointed out by Glanzer and Glaser
(1961), small group research that uses
systematic learning-theory concepts as
independent variables is relatively new.
Sidowski, Wyckoff, and
Tabory (1956), as a means of shaping be- 
havior, punished incorrect responses and pre- 
sented reward for correct responses. Some- 
what more recently, Rosenberg and Hall 
(Rosenberg, 1959; Rosenberg & Hall, 1958) 
used stimulus-response theory to explain the
behavior of individual organisms in a two-
person group, identifying the feedback to
members as being “direct” (contingent
upon an individual’s own performance),
“confounded” (based only partly on his own
performance), or “other’s” (based on the
performance of the other team member). One
of the interesting findings of the Rosenberg 
and Hall series of studies shows that team 
performance was about equal in both those 
conditions where members received direct 
feedback, and those where they received con- 
founded feedback. Even in a variant of the 
confounded feedback condition, where one
team member's response was weighted three 
times that of the other member's, there were 
no significant differences in team perform- 
ance (Hall, 1957). 

Similarly, Glaser, Klaus, and Egerman 
(1962), and Zajonc (1962) demonstrated that 
with continued practice, some multimember 
teams learned to improve their performance 
as a group when the only feedback about 
individual performance was based on the 
overall team performance (a “confounded” 
condition in Rosenberg and Hall’s terminol-
ogy).

On the other hand, not all teams react this 
way. Egerman, Klaus, and Glaser (1962) 
investigated a particular kind of team per- 
formance where two of the members’ tasks 
were performed in “parallel,” that is, where 
one member duplicated the other’s perform- 
ance. On any one trial, correct performance 
by one parallel member made the perform- 
ance of the other member redundant. Teams 
arranged with redundant members showed a 
significant performance decrement in team 
output with continued practice; incorrect re- 
sponses by one of the parallel members were 
reinforced whenever the team as a whole 
performed correctly. Since this occurred inter- 
mittently for both of the parallel members, 
both showed a performance decrement with 
continued performance, with a significant re- 
duction in the team’s overall performance 
level. 
From these studies, it is evident that teams
continue to learn under either direct feed- 
back (reinforcement) contingencies or con- 
founded feedback contingencies, provided 
that this feedback represents appropriate in- 
dications of correct individual performance. 
However, whenever team feedback to the 
members inappropriately reinforces incorrect 
individual responses (as in the parallel ar- 
rangement), member proficiency is likely to 
show a decrement which is in turn reflected in 
a team performance decrement; this appears, 
however, despite a continuous schedule of 
reinforcement for correct team responses. 

Clearly, a major factor which requires more 
systematic examination is the influence of the 
team arrangement on the feedback which
team members experience. In the present 
study, certain general hypotheses derived 
from an analysis of nongroup learning pa- 
rameters were formulated to test their appli- 
cability in describing the behaviors of indi- 
viduals in various group settings; the hy- 
potheses tested concerned the relationship 
between team performance and the environ-
mental feedback contingencies of members in
teams which differed in arrangement. 


Team Arrangement 


The experimental paradigm developed in 
this research grew out of previous efforts to 
study the performance of selected military 
teams. Because of the limitations of field re- 
search in examining the individual contribu- 
tions to team performance, the Team Train- 
ing Laboratory at the American Institutes 
for Research undertook a program of research 
in order to more accurately investigate the 
relationship of individual performance and 
team performance characteristics (Klaus & 
Glaser, 1960). The paradigm adopted in- 
corporated those features described in Za- 
jonc’s (1965) description of a “standard 
group task.” Utilizing a timing response as 
the basic task for each team member, it was 
possible to study individual performance de- 
velopment, the effect of varying schedules 
of reinforcement on individual performance, 
and the effect on individual performance of 
manipulating feedback, based either on indi-
vidual or team performance. Unlike the typi-
cal “group” problem-solving situations, the
present experimental paradigm may be de-
scribed as: having a rigid structure, organi-
zation, and communication network; having
well-defined positions or assignments permit-
ting an analysis of the contribution made by
each team member to the unit’s output; re- 
quiring the coordinated participation of sev- 
eral individuals whose specialized perform- 
ances have little ostensible overlap of func- 
tion; requiring at least some minimal level of 
proficiency from each member in order to 
carry out the mission of the team; permitting 
the individual performance of all members to 
be influenced by the performance of the other 
members of the team. 

In this study, three different arrangements 
of two-man teams were investigated. One was 
arranged in “series,” so that both members 
were required to perform correctly to com- 
plete the task. One was arranged in “paral- 
lel,” where either member might perform cor- 
rectly to complete the task. The third was ar- 
ranged so that the performance of only one 
of the team members determined the team 
output; since team performance depended 
only on one member, the term “individual” 
team was coined for it. Each of these teams, 
then, differed in only one ostensible dimen- 
sion. (For an analysis of team performance 
when the structure of a seven-member team is 
successively changed over trials from a paral- 
lel condition to a series condition, see Zajonc 
and Taylor, 1963. The results of this study 
demonstrate that as more “series” members 
are incorporated—from two to seven—the less 
likely is the team to complete its task. Indi- 
vidual performance, however, does continue 
to improve over the entire series of 70 learn- 
ing trials.) 


Determining Feedback Linkages 


Based on the arrangement of team mem- 
bers, it was possible to identify the type of 
“feedback linkage” attending each member’s
performance. This was determined, first, by
the appropriateness of an individual team 
member’s response, and second, by the sub- 
sequent indication of the team’s success. Four 
feedback linkages were identified: appropri-
ate reinforcement (AR), appropriate nonre-
inforcement (AN), inappropriate reinforce-
ment (IR), and inappropriate nonreinforce-
ment (IN). AR occurs in all arrangements
when both members perform correctly, and the
feedback to both indicates correct team per- 
formance. AN occurs to a member when he 
performs incorrectly, and at the same time 
the team does not complete the task. IN oc- 
curs if a member does perform correctly, but 
the team as a whole fails to complete the 
task. IR occurs if a member performs incor- 
rectly, but the team still completes the task. 
(This can happen only in the parallel ar- 
rangement and to that member of the “indi- 
vidual” team—the “partner”—whose per- 
formance is inconsequential to the team’s out- 
put.) 


Determining Feedback Parameters and 
Schedules of Reinforcement 


The conditions under which a team mem-
ber would experience a particular feedback
parameter were a function of both the feed-
back linkages (AR, AN, etc.) he would ex-
perience, based on the arrangement of the
team he was in, and his own and his partner’s
proficiency, or skill as a team member. The
probability of correct initial team perform-
ance was estimated for series teams by the
multiplication theorem of probability, (X)(Y),
and for parallel teams by the addition
theorem of probability, X + Y - (X)(Y),
where X and Y are the performance profi-
ciencies of the two team members, respec-
tively. Knowing these values, it was possible
to calculate the feedback parameter (the 
probability that each of the four feedback 
linkages would occur). The schematic for 
performing this operation is outlined in Table 
1. On the basis of this feedback parameter, 
estimates of schedules of reinforcement were 
derived. Unlike conventional schedules of re- 
inforcement (e.g., Ferster & Skinner, 1957), 
however, the ones obtained in this study also 
contained a provision for taking into ac- 
count reinforcement for both correct and in- 
correct responses (AR and IR conditions, re- 
spectively), as well as nonreinforcement for 


TABLE 1
USING TEAM MEMBER PERFORMANCE PROBABILITIES TO ESTIMATE THE FEEDBACK PARAMETER
IN SERIES, PARALLEL, AND “INDIVIDUAL” TEAM ARRANGEMENTS

Arrangement                                P (correct team performance)

Series team                                (X)(Y)
Parallel team                              X + Y - (X)(Y)
“Individual” team (preselected member)     X

Each panel (series team, parallel team, and the preselected member and partner of
the “individual” team) crosses reinforcement (present, absent) with response
(correct, incorrect).

Note. Based on finding the feedback parameter for Member 1, with performance
probability X. Interchange X and Y to find the feedback parameter for Member 2.
In the “individual” team, Member 1 has been arbitrarily designated the
preselected member.

both correct and incorrect performance (IN
and AN conditions, respectively). By so do- 
ing, it was hypothesized that when initial ex- 
pected schedules of reinforcement were rank 
ordered in terms of the percentage of the time 
correct responses would be reinforced and in- 
correct responses would be nonreinforced, in- 
dividual performance of the members (and 
subsequently of the team as a unit) might be 
predicted. The order in which the members
were predicted to perform, from best to poor-
est, was: preselected members of the “indi-
vidual” teams, series team members, parallel
team members, and partners of the prese- 
lected “individual” team members. This or- 
der was arrived at by examining the nature 
(and secondarily, the frequency) of the kinds 
of feedback an individual might expect as he 
performed in the particular arrangement sit- 
uation, taking into account his own, and his 
partner’s, proficiency, or skill. 
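
The feedback parameter described above can be illustrated with a short Python sketch. It is a reconstruction from the definitions of the four feedback linkages, not a transcription of the cell entries in Table 1, and it assumes that the two members respond independently, as the use of the multiplication and addition theorems implies. The names `Feedback`, `team_success`, and `feedback_parameter` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feedback:
    AR: float  # correct response, team reinforced
    IN: float  # correct response, team not reinforced
    IR: float  # incorrect response, team reinforced
    AN: float  # incorrect response, team not reinforced


def team_success(arrangement, x, y):
    """Probability of a correct team response; x and y are member proficiencies."""
    if arrangement == "series":
        return x * y              # both members must be correct
    if arrangement == "parallel":
        return x + y - x * y      # at least one member must be correct
    if arrangement == "individual":
        return x                  # only the preselected member (x) counts
    raise ValueError(f"unknown arrangement: {arrangement}")


def feedback_parameter(arrangement, own, other, role="member"):
    """Linkage probabilities for a member with proficiency `own` and partner `other`."""
    if arrangement == "series":
        return Feedback(AR=own * other, IN=own * (1 - other), IR=0.0, AN=1 - own)
    if arrangement == "parallel":
        return Feedback(AR=own, IN=0.0,
                        IR=(1 - own) * other, AN=(1 - own) * (1 - other))
    if arrangement == "individual" and role == "preselected":
        return Feedback(AR=own, IN=0.0, IR=0.0, AN=1 - own)
    if arrangement == "individual" and role == "partner":
        # Team outcome depends only on the preselected member (`other`).
        return Feedback(AR=own * other, IN=own * (1 - other),
                        IR=(1 - own) * other, AN=(1 - own) * (1 - other))
    raise ValueError(f"unknown arrangement or role: {arrangement}, {role}")


if __name__ == "__main__":
    for arrangement in ("series", "parallel"):
        print(arrangement, feedback_parameter(arrangement, own=0.8, other=0.5))
    print("partner", feedback_parameter("individual", own=0.5, other=0.8, role="partner"))
```

In this sketch the four probabilities sum to 1 for every member, and AR + IR equals the probability of a correct team response. Inappropriate reinforcement (IR) is nonzero only for parallel members and for the partner in the “individual” team, in line with the text.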


METHOD 
Apparatus 


A more detailed description of the apparatus used 
may be found in Egerman et al. (1962). Each sub- 
ject was seated behind his own control panel which 
contained: three lights used to present four different 
stimulus patterns; a counter and feedback light 
operating in parallel, indicating individual correct 
responses; and a bat-levered spring-release switch, 
by which the subject made an appropriate timing 
response. Obvious to both subjects was a large wall 
counter, used to indicate total number of team re- 
sponses. A multiple-pen recorder provided a record 
of the rate of stimulus presentations, individual cor- 
rect responses, and correct team responses. Subjects 
were trained to perform two timing responses: a
2-second one, which was allowed to vary ± .183
second, and a 4-second response (± .258 second).


Design and Procedure 


Two college-aged subjects in each of 18 teams 
received 2 hours of training daily for 11 days. Each 
pair was trained separately in two major phases: 
preteam and team training. During preteam training 
subjects first practiced making timing responses of 
2 and 4 seconds for 20 5-minute periods; all correct 
responses were followed by appropriate feedback via 
each subject's panel counter and feedback light. 
Then the subjects learned to associate two stimulus 
light patterns with each press; these appeared in 
random order except no two identical stimuli fol- 
lowed each other in immediate succession (20 5-minute 
periods). Again, all correct responses to the stimu- 
lus presentations were followed by individual feed- 
back. In the final segment of preteam training, sub- 
jects were paced (50 5-minute periods), responding 
with one press to each pattern. (The average interval 
between presentations was 18.5 seconds; the range 
being 10-27 seconds.) During the last 2 5-minute 
periods of this training, the ratio of each team 
member's correct responses to total responses made 
during this 10-minute period was calculated. This 
measure of his proficiency was then used as the 
basis for systematic assignment of the six pairs of 
subjects to each of the three arrangement condi- 
tions, so that the average proficiency and variability 
of the team members in one arrangement was not 
significantly different from these same measures in 
the other arrangements. 

In the next major training phase, team training, 
each pair of subjects was verbally instructed to work 
as a team and attempt to score points on the wall 
counter. The subjects were informed that points 
could not be registered until both members had 
completed their response. As soon as the team scored 
a point, a new pattern was immediately presented. 
If the team failed to register a point, the next pat- 
tern was presented after an interval (averaging 27 
seconds; the range being 19-36 seconds). The sub- 
jects were informed that their only feedback would 
be the registering or not registering of points on 
the wall counter, since the individual counters and 
feedback lights on the individual panels were made 
inoperable. The subjects were not informed as to 
the arrangement in which they would be perform- 
ing. (With regard to this point, it was interesting to 
note that none of the 36 subjects in over 10 hours of 
team performance guessed that he and his partner 
might not be working as a "team," as was true in 
the parallel and individual team arrangements.) 
Team training lasted for 130 5-minute periods. 


RESULTS 


The data obtained in the present study 
were analyzed in terms of the effects of team 
arrangement on individual and team performance. 
Team arrangement was described in 
terms of the various feedback parameters existing 
for team members; these, in turn, were 
hypothesized to affect the accuracy of the 
performance of the team members and, subsequently, 
of the team. 


Individual Preteam Proficiency and Initial 
Team Performance 


Measures of preteam proficiency were entered 
into the appropriate algebraic expression 
to predict team performance for all 18 
teams. These predictions were compared to the 
observed team proficiency values (empirical measures 
of performance for each of the 18 teams during 
the initial 10 5-minute periods of team 
training). The Spearman rank-order correlation 
was .706 (p < .01), indicating 
that preteam individual member proficiencies, 
properly related by probability theory, yield 
fairly accurate predictions of initial team 
performance. 
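
The rank-order comparison described above can be illustrated with a minimal sketch. The function names and the four proficiency values in the usage note are hypothetical and are not the study's data; the sketch simply computes a Spearman coefficient as the product-moment correlation of ranks, with tied values given their average rank.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical predicted and observed team proficiencies (illustration only).
predicted = [0.20, 0.50, 0.90, 0.40]
observed = [0.25, 0.45, 0.80, 0.50]
rho = spearman_rho(predicted, observed)
```

With these made-up values two adjacent teams exchange ranks, so the coefficient falls below 1 while remaining strongly positive.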

An analysis of variance was performed on 
the proficiency data (subject to a square-root 
transformation) comparing team arrange- 
ments and proficiency estimates. The arrange- 
ment variable comprised the series, parallel, 
and individual conditions, while the profi- 
ciency variable comprised predicted and ob- 
served proficiency. It was observed that the 
groups differed substantially on the basis of 
team arrangement (F = 1144, df = 2, p < 
.01) but there was no evidence of a difference 
between predicted and observed performance. 
The interaction effect was not significant. 


Estimating Schedules of Reinforcement 


The schedule of reinforcement for each 
member was estimated from the team arrangement 
to which he was assigned and from the 
proficiencies of each team pair. Of the four 
different kinds of reinforcement schedules, it 
was suggested that those members who would 
experience the most favorable schedules 
would be the preselected members in the indi- 
vidual team arrangement (receiving AR and 
AN), followed by the series team members 
(AR, AN, and IN), then the parallel team 
members (AR, AN, and IR), and finally the 
partners of the preselected members of the 
individual team arrangement (AR, AN, IN, 
and IR). The rank ordering of these schedules, 
therefore, was based first on the qualitative 
schedule, and then, within each qualitative 
schedule, on the quantitative 
schedule. Thus, all of the preselected members 
in the individual teams were considered as 
having an equally favorable performance 
climate, since all were on a continuous sched- 
ule of reinforcement. In the series arrangement, 
the member with the greatest chance of having 
his correct responses reinforced was considered 
to have the best performance climate. In the 
parallel arrangement, the member who had the 
lowest chance of being reinforced for an incorrect 
response was considered likely to show the least 
performance decrement. Finally, the partner of the 
preselected member of the individual arrangement who 
had the highest ratio of reinforcements-for- 
correct-responses to reinforcements-for-incorrect-responses 
was considered to have the 
most favorable performance climate of members 
in this condition. 

These schedules, when rank ordered and 
compared with the rank order of the total 
number of correct responses each team mem- 
ber was observed to make, showed a Spear- 
man correlation of .201 (p > .05). The mag- 
nitude of the correlation between rank-ordered 
reinforcement schedules and percentage 
of correct responses (proficiency) was .444, 
which is statistically different from zero (p 
< .01). Thus, there does not appear to be a 
significant relationship between the total 
number of correct responses made by each member 
and his schedule of reinforcement; however, 
when this figure is converted into the proportion 
of correct responses, a significant relationship 
is found. This difference 
was attributable to differences in the rate 
with which differently arranged teams per- 
formed. Parallel teams, in general, were ob- 
served to respond most rapidly because only 
one member was required to perform cor- 
rectly. This increased the probability of the 
team's getting a correct response, and conse- 
quently resulted in the presentation of another 
response opportunity for both members more 
rapidly. Series team members, generally, 
showed lowest performance rates. Individual 
team rates of pacing usually fell between these 
two. Correcting for this differential rate of 
responding among teams reveals that there is 
no significant relationship between schedule 
of reinforcement and rate of responding, but 
that schedules of reinforcement are signifi- 
cantly related to the percent accuracy of 
performance, a measure obtained when a 
correction for rate is made. 


Initial and Final Team Performance 
Compared 


Finally, it was hypothesized that, on the 
basis of the arrangement and attendant feedback 
to the various team members, series 
teams would show a performance increment 
from the initial to the final periods of team 
training, parallel teams a decrement from the 
initial to the final periods, and individual 
teams an increment. As for the partners of the 
preselected members of the individual team 
arrangement, these subjects, whose performance 
did not contribute to the team output, 
were expected to show a decided performance 
decrement from the initial to the final periods 
of team performance. 


TABLE 2 

INITIAL AND FINAL AVERAGE TEAM PROFICIENCIES, DIFFERENCES, AND STATEMENT OF THE SCHEDULE 
OF REINFORCEMENT FOR SERIES, PARALLEL, AND INDIVIDUAL TEAM ARRANGEMENTS 

                             5-minute periods 
Team arrangement             Initial 10       Final 10      Difference    Reinforcement schedule 
Series                       46.[illegible]   48.83         2.15          Aperiodic reinforcement for correct responses 
Parallel                     [illegible]      [illegible]   [illegible]   Continuous reinforcement for correct; aperiodic for incorrect responses 
Individual (preselected)     68.67            70.33         2.50          Continuous reinforcement for correct responses 
Individual (partner)         64.33            38.50         -25.66        Aperiodic reinforcement for both correct and incorrect responses 

Table 2 presents the mean team proficien- 
cies (correct team responses divided by at- 
tempts) for the three team arrangements dur- 
ing the first block of 10 5-minute periods 
(Periods 1-10) and during the last 10 5- 
minute periods (Periods 121-130), permit- 
ting a comparison between the first and the 
last performance periods. The differences be- 
tween these values and a description of the 
reinforcement schedule also appear in Table 
2. The Walsh test (Siegel, 1956) revealed that 
the difference in series team performance and 
the difference in the individual team's per- 
formance (preselected member) were not sig- 
nificant. 

The data were further examined to deter- 
mine whether or not the series team members 
and the preselected members had been per- 
forming at a stable level during their pre- 
team training; if this were so, significant in- 
creases in subsequent team performance 
would have been highly improbable. This as- 
sumption of performance at a stable level was 
tested by averaging the performance of the 
respective team members during the last 30 
5-minute periods of preteam training, in or- 
der to ascertain if there was an obvious trend 
in performance, or if indeed some near stable 
level had already been attained. These data 
revealed that a stable performance already 
had been achieved by the series teams, thus 
accounting for the lack of significant improvement 
in subsequent team training from 
the initial to the final periods of team performance. 
The data also showed, however, 
that the performance of the preselected members 
of the individual teams had not attained 
stability; why these members should not have 
shown an improvement remains unexplained. 
The decrement observed in the other two conditions 
(parallel teams and partners of preselected 
members of individual teams) from 
the initial to the final periods of training was 
significant (p = .016). 

In order to examine team performance 
characteristics from the data throughout the 
entire 130 periods of training, an analysis of 
variance was performed, using team arrange- 
ment and periods of practice as the major 
variables; the periods-of-practice variable was 
broken down into 13 blocks of 10 5-minute 
periods each. The dependent variable used in 
this analysis was the total number of team 
points obtained by each team in each ar- 
rangement during the respective practice 
periods. The results indicated a significant 
effect due to arrangement (F = 8.17, df = 2, 
p < .01), and no significant effect due to 
periods of training. The interaction between 
these two variables was significant (F = 1.65, 
df = 24, p < .05) indicating that, depending 
on the type of team arrangement, a change in 
performance does occur over the total number 
of practice periods; on the basis of a 
comparison between the initial and final 
periods of team performance, the series and 
individual teams did not show significant per- 
formance increments, but the parallel teams 
showed significant performance decrements. 


CONCLUSION 


Team performance showed those character- 
istics which were hypothesized to occur on 
the basis of the performance of the team 
members. Performance of team members, in 
turn, was found to be a function of the 
schedules of reinforcement set up for them 
by the team environment. Such schedules are 
a function of the extent to which feedback 
linkages permit reinforcement for both correct 
and incorrect responses. In addition, the 
nature of the feedback linkages for a team 
member was shown to depend on the arrangement 
of the team members. 

On the basis of the obtained results, the 
following statements appear warranted: 

1. Preteam measures of individual proficiency, 
when entered into appropriate probability 
formulae based on the manner in 
which the individuals are to be related as 
team members, provide accurate predictions 
of the initial performance of the team. Series 
team performance can be described by the 
addition theorem of probability; parallel team 
performance by the multiplicative theorem 
(assuming independent performance); and 
individual team performance is based on the 
preselected member's preteam proficiency. 

2. Feedback linkages, defined in terms of 
the appropriateness of team output to team 
member response, may be observed as follows 
in the different team arrangements: AR oc- 
curs when team reinforcement follows correct 
individual performance; AN occurs when no 
team reinforcement follows an incorrect indi- 
vidual response; IN occurs (except in parallel 
teams) when no team reinforcement follows 
a correct individual response; and IR 
occurs (except in the series teams and the 
preselected member of the individual teams) 
when team reinforcement follows incorrect 
individual performance. 

3. Feedback parameters may be calculated 
by estimating the probability that any indi- 
vidual will experience one or more of the 
various feedback linkages; taking the ratio 
of his correct (or incorrect) performance 
probability to the team's correct (or incor- 
rect) performance probability provides the 
appropriate values for AR, AN, IR, and IN. 
An individual's performance probability is 
based on his own proficiency; a team’s per- 
formance probability is based on the profi- 
ciency of both members, and on the team 
arrangement. 

4. Schedules of reinforcement estimated to 
exist during team performance may be calcu- 
lated on the basis of individual proficiencies 
and the assignment of team members to a 
particular arrangement. These schedules may 
be ordered from most to least favorable as 
follows: continuous reinforcement for correct 
performance (preselected members of the in- 
dividual arrangement); aperiodic reinforce- 
ment for correct performance (for members 
in the series arrangement); continuous rein- 
forcement for correct performance, but aperi- 
odic reinforcement for incorrect performance 
(parallel team members); and aperiodic rein- 
forcement for both correct and incorrect per- 
formance (partners in the individual team 
arrangement). 

5. Changes from initial to final team per- 
formance may be anticipated on the basis of 
an examination of the schedules of reinforce- 
ment existing for team members as they per- 
form in various team arrangements. Since 
series teams showed no performance decre- 
ment from the initial to the final periods of 
team performance, the aperiodic schedules of 
reinforcement present apparently are suffi- 
cient to maintain stable levels of performance 
in these members, and hence, in the perform- 
ance of the team. Parallel teams show a dec- 
rement; the schedules of reinforcement for 
incorrect responses under which these mem- 
bers perform result in individual, and conse- 
quently team, performance decrements. The 
individual team arrangements show no per- 
formance decrement from the initial to the 
final periods of team performance; the con- 
tinuous schedules of reinforcement for cor- 
rect responses under which the preselected 
members perform appear conducive to per- 
formance maintenance in these team mem- 
bers. 
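
The feedback linkages listed in Statement 2 can be sketched numerically for a two-man team. This is only an illustration under stated assumptions: the two members respond independently, `p_self` and `p_partner` are hypothetical per-trial probabilities of a correct response, and the function returns the joint per-trial probability of each linkage rather than the ratio values described in Statement 3, whose exact formulae are not reproduced here.

```python
def linkage_probabilities(p_self, p_partner, role):
    """Per-trial probability of each feedback linkage for one member.

    role is one of "series", "parallel", "preselected", or "partner".
    For "partner", p_partner is the preselected member's proficiency,
    since only that member's response determines team output.
    Independence of the two members is an assumption of this sketch.
    """
    q_self, q_partner = 1 - p_self, 1 - p_partner
    if role == "series":        # team scores only if both are correct
        return {"AR": p_self * p_partner, "IN": p_self * q_partner,
                "AN": q_self, "IR": 0.0}
    if role == "parallel":      # team scores if either is correct
        return {"AR": p_self, "IN": 0.0,
                "AN": q_self * q_partner, "IR": q_self * p_partner}
    if role == "preselected":   # team output mirrors this member
        return {"AR": p_self, "IN": 0.0, "AN": q_self, "IR": 0.0}
    if role == "partner":       # team output ignores this member
        return {"AR": p_self * p_partner, "IN": p_self * q_partner,
                "AN": q_self * q_partner, "IR": q_self * p_partner}
    raise ValueError(f"unknown role: {role}")


# Hypothetical proficiencies, for illustration only.
series_member = linkage_probabilities(0.8, 0.7, "series")
parallel_member = linkage_probabilities(0.8, 0.7, "parallel")
partner_member = linkage_probabilities(0.8, 0.7, "partner")
```

Under these assumptions the pattern of zero entries agrees with the exceptions noted in Statement 2: IR does not arise for series members or for the preselected member, IN does not arise for parallel members, and only the partner in the individual arrangement can experience all four linkages.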

The approach taken here may be con- 
trasted with some of the early small group 
research studies of Bavelas (1950) and 
Leavitt (1951), in which “structure” played 
an important part. In those studies, the con- 
necting links between members were spoken 
of in terms of communication channels, and 
not feedback channels. Those authors studied 
the effects of arrangement or structure on team 
performance and used, as one of their princi- 
pal dependent variables, the speed with which 
the variously structured teams learned the 
solution to the problem. 

The question might be raised as to whether 
there is a difference between what has been 
termed “communication channel,” and what 
in this study is termed “feedback parameter.” 
Probably, these are very similar. Various com- 
munication channels permit appropriate in- 
formation, or feedback, to reach the respec- 
tive team members with a certain probability, 
which varies with the structure of the team. 
Thus, Bavelas reports a study by Smith indi- 
cating that members arranged in a “chain” 
performed more accurately than members 
structured as a “circle.” This was also con- 
firmed in the Leavitt study and in one by 
Heise and Miller (1951), lending support to 
the concept that various team structures with 
their different attendant communication chan- 
nels do affect team performance differentially. 
However, these early studies had little to say 
in the way of a priori predictions of team 
performance as a function of structure. At 
least the predictions were not based on rig- 
orous learning-theoretic concepts which would 
readily permit us to transfer predictions of 
team performance to other kinds of team 
structures. Thus, even though channels of 
communication and feedback channels may be 
synonymous, the present work leads to a more 
generalizable model for group studies. 

It is clearly apparent that the study of 
groups requires a knowledge of individual 
performance parameters. But, on the other 
hand, whenever individuals are assembled into 
groups, a “group environment” is created 
which establishes certain feedback parameters 
for the members. The advantage of the par- 
ticular method of group performance analysis 
introduced in the present study is that the 
nature of this “group” environment may be 
hypothesized a priori, knowing certain pa- 
rameters of individual performance and the 
type of group arrangement. 

The team paradigms used in the present 
study, and in the studies conducted by Za- 
jonc and his associates, may be contrasted 
with the model developed by Lorge and Solo- 
mon (1955, 1962) and Tuckman and Lorge 
(1962) describing group  problem-solving 
tasks. The principal distinction lies in the 
ability of the team paradigms to estimate in- 
dividual performance, to study its rate of 
development, and to empirically account for 
changes in it as a result of the interaction of 
the group members. The effort in the two ap- 
proaches is quite congruent in that they both 
attempt to account for the probability of suc- 
cessful “group” performance, even though one 
is a loosely structured group, and the other is 
a highly formalized team. Two advantages 
seem to accrue to those working with the 
highly structured team approach: first, the 
subject's task usually has an underlying con- 
tinuous distribution, that is, individuals can 
perform it with a probability extending from 
.00 to 1.00. Subjects, then, may be assembled 
so that expected team performance across 
teams may vary continuously. In the group 
problem-solving situation, an individual is 
known either to be able to solve some portion 
of the problem (p = 1.00) or he cannot (p = 
.00). The probability that a group will solve 
a problem, therefore, has a multinomial distribution, 
depending upon the number of 
stage-wise solutions to the problem. Determining 
the probability that a particular 
group will solve a particular problem is, 
then, a rather arduous task, since the 
proportion of individuals who can solve 
each successive phase of the problem must 
first be estimated. The second advantage in 
dealing with a highly structured team where 
the individuals have known probabilities of 
performance is its amenability to investigat- 
ing the effect of the group environment on 
any individual's performance; that is, the re- 
sults of interaction between members are 
more apparent. However, the two approaches 
are quite similar; the group problem-solving 
paradigm will be identical to the team situa- 
tion when the underlying continuous nature 
of problem-solving tasks has been further 
explored. For example, the degree to which 
an individual is familiar with a problem seems 
to facilitate its solution by the individual or 
the group of which he is a part (Lorge & 
Solomon, 1959). 

In this study, the utilization of member 
proficiency and team arrangement to predict 
initial team performance was found to be 
feasible for three different types of two-man 
arrangements—series, parallel, and individual. 
Since this method was found to be effective 
for these arrangements, it is probably appli- 
cable to a wide range of other arrangements, 
with larger numbers of team members, where 
reliable preteam performance measures and 
knowledge of the manner in which these are 
to be combined are available. What appears 
to be important in this particular extension of 
the present approach is not how large the 
team is, but rather, what kinds of feedback 
parameters exist as a function of the profi- 
ciencies and arrangement of team members. 
Thus, if comparably proficient individuals are 
added in series to an already existing series 
team which performs at some observed level, 
then the addition of these members will re- 
duce the proficiency of the team. On the 
other hand, increasing the number of parallel 
members in a team would increase the prob- 
ability of correct team performance, but also 
would result in an increase in the probability 
that incorrect performance for these members 
would be reinforced. Calculating the feed- 
back contingencies for a “mixed” series- 
parallel team of N members would require 
only slightly more lengthy probability form- 
ulae than used for the “pure” series or paral- 
lel teams. Again, though, the important fea- 
ture would be the kinds and the frequency of 
feedback which the team members might re- 
ceive as components of any particular team 
arrangement. 
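
The extension to larger teams can be sketched as follows. The sketch assumes independent members, takes a series team to score only when every member is correct and a parallel team to score when at least one member is correct (as in the task described under Method), and uses hypothetical proficiencies; the function names are illustrative and do not come from the study.

```python
from math import prod


def series_proficiency(ps):
    """Probability that every member of a series team is correct."""
    return prod(ps)


def parallel_proficiency(ps):
    """Probability that at least one member of a parallel team is correct."""
    return 1 - prod(1 - p for p in ps)


def parallel_ir_probability(p_self, partner_ps):
    """Probability that a parallel member is incorrect yet the team scores."""
    return (1 - p_self) * parallel_proficiency(partner_ps)


# Hypothetical members, each with proficiency .80 (illustration only).
two_in_series = series_proficiency([0.8, 0.8])
three_in_series = series_proficiency([0.8, 0.8, 0.8])
two_in_parallel = parallel_proficiency([0.8, 0.8])
three_in_parallel = parallel_proficiency([0.8, 0.8, 0.8])

# A "mixed" team: a parallel pair placed in series with a third member.
mixed = series_proficiency([parallel_proficiency([0.8, 0.7]), 0.9])
```

With these values a third series member lowers expected team proficiency, a third parallel member raises it, and the chance that a parallel member's incorrect response is reinforced rises with each added partner, which is the trade-off described in the paragraph above.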

In a related effort, Smoke and Zajonc 
(1962) have developed a rather sophisticated 
model of decision-making characteristics of 
informal groups. They propose that the effec- 
tiveness of the group’s output depends on 
both the type of group structure which exists 
(ranging from dictatorship, oligarchy, una- 
nimity, fixed, quorum, minimal quorum, to 
independent group) and the probability (p) 
that a given individual is correct in his 
knowledge of the solution to the problem. 
By plotting h(p), a function determining the 
probability of a correct group response (which 
is unique for each group mentioned) against 
p, one may observe and compare the various 
group performance capabilities. As defined by 
these authors, it is apparent that the “series” 
team in the present study corresponds to their 
“unanimity” group; the “parallel” team to 
the “minimal quorum” group; and the “individual” 
team to the “dictatorship.” At all 
points on the p axis for these three groups 
(except p = 1.00) it is interesting to note that 
the probability of group success falls in the 
order, from most likely to least likely, mini- 
mal quorum (parallel), dictatorship (indi- 
vidual), unanimity (series), the same order 
observed in the present study. 
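
The ordering just described can be checked with a short sketch for a two-member group. The three expressions are assumptions chosen to match the correspondence stated above (unanimity as both members correct, minimal quorum as at least one correct, dictatorship as a single designated member correct), with both members given the same hypothetical proficiency p and assumed independent; they are not taken from Smoke and Zajonc's paper.

```python
def group_success(p):
    """Assumed success probabilities for a two-member group of proficiency p."""
    return {
        "minimal quorum (parallel)": 1 - (1 - p) ** 2,  # at least one correct
        "dictatorship (individual)": p,                 # designated member correct
        "unanimity (series)": p ** 2,                   # both correct
    }


def ordering(p):
    """Group types sorted from most to least likely to succeed."""
    probs = group_success(p)
    return sorted(probs, key=probs.get, reverse=True)


ranked_at_half = ordering(0.5)
```

For intermediate values of p the sorted list places the minimal quorum first and unanimity last; at p = 1.00 the three probabilities coincide.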

The results of the present study have indi- 
cated that team performance may be consid- 
ered in those terms which have been used to 
describe the performance of individuals; that 
is, schedules and contingencies of reinforce- 
ment. Knowing these, a detailed description 
of team performance becomes feasible, as 
demonstrated in the present study. In addi- 
tion, this study presents a general learning- 
theoretic framework within which future re- 
search might be conducted in the area of 
multiman performance. 


4 modes of reacting to the late adolescent identity crisis were described, measured, 
and validated. Criteria for inclusion in 1 of 4 identity statuses were the 
presence of crisis and commitment in the areas of occupation and ideology. 
Statuses were determined for 86 college male Ss by means of individual interviews. 
Performance on a stressful concept-attainment task, patterns of goal 
setting, authoritarianism, and vulnerability to self-esteem change were dependent 
variables. Ss higher in ego identity performed best on the concept-attainment 
task; those in the status characterized by adherence to parental 
wishes set goals unrealistically high and subscribed significantly more to authoritarian 
values. Failure of the self-esteem condition to discriminate among 
the statuses was attributed to unreliability in self-esteem measurement. 


Ego identity and identity diffusion (Erik- 
son, 1956, 1963) refer to polar outcomes of 
the hypothesized psychosocial crisis occurring 
in late adolescence. Erikson views this phase 
of the life cycle as a time of growing occupational 
and ideological commitment. Facing 
such imminent adult tasks as getting a job 
and becoming a citizen, the individual is 
required to synthesize childhood identifications 
in such a way that he can both establish 
a reciprocal relationship with his society 
and maintain a feeling of continuity within 
himself. 

Previous studies have attempted to deter- 
mine the extent of ego-identity achievement 
by means of an adjustment measure and 
the semantic differential technique (Bronson, 
1959), a Q-sort measure of real-ideal-self discrepancy 
(Gruen, 1960), a measure of role 
variability based on adjective ranking (Block, 
1961), and a questionnaire (Rasmussen, 
1964). While these studies have investigated 
self-ratings on characteristics that should follow 
if ego identity has been achieved, they 
have not dealt explicitly with the psychosocial 
criteria for determining degree of ego identity, 
nor with testing hypotheses regarding direct 
behavioral consequences of ego identity. 


1 This paper is based in part on a doctoral dissertation 
submitted to the Ohio State University 
(Marcia, 1964). 
research. The author also wishes to thank D. P. 
concerning portions of the final manuscript. To the 
ticularly to M. B. Herzbrun who served as research 
Parts of the research were reported in a paper 
read at the 1965 Midwestern Psychological Association 
convention. 

To assess ego identity, the present study 
used measures and criteria congruent with 
Erikson’s formulation of the identity crisis 
as a psychosocial task. Measures were a semi- 
structured interview and an incomplete-sen- 
tences blank. The interview (see Method 
section) was used to determine an indi- 
vidual’s specific identity status; that is, 
which of four concentration points along a 
continuum of ego-identity achievement best 
characterized him. The incomplete-sentences 
blank served as an overall measure of identity 
achievement. The criteria used to establish 
identity status consisted of two variables, 
crisis and commitment, applied to occupa- 
tional choice, religion, and political ideology. 
Crisis refers to the adolescent's period of 
engagement in choosing among meaningful 
alternatives; commitment refers to the degree 
of personal investment the individual exhibits. 

"Identity achievement" and "identity dif- 
fusion" are polar alternatives of status in- 
herent in Erikson's theory. According to the 
criteria employed in this study, an identity- 
achievement subject has experienced a crisis 
period and is committed to an occupation and 
ideology. He has seriously considered several 
occupational choices and has made a decision 
on his own terms, even though his ultimate 
choice may be a variation of parental wishes. 
With respect to ideology, he seems to have 
reevaluated past beliefs and achieved a reso- 
lution that leaves him free to act. In general, 
he does not appear as if he would be over- 
whelmed by sudden shifts in his environment 
or by unexpected responsibilities. 

The identity-diffusion subject may or may 
not have experienced a crisis period; his hall- 
mark is a lack of commitment. He has neither 
decided upon an occupation nor is much con- 
cerned about it. Although he may mention a 
preferred occupation, he seems to have little 
conception of its daily routine and gives the 
impression that the choice could be easily 
abandoned should opportunities arise else- 
where. He is either uninterested in ideological 
matters or takes a smorgasbord approach in 
which one outlook seems as good to him as 
another and he is not averse to sampling 
from all. 

Two additional concentration points 
roughly intermediate in this distribution are 
the moratorium and foreclosure statuses. The 
moratorium subject is in the crisis period 
with commitments rather vague; he is dis- 
tinguished from the identity-diffusion subject 
by the appearance of an active struggle to 
make commitments. Issues often described as 
adolescent preoccupy him. Although his par- 
ents’ wishes are still important to him, he is 
attempting a compromise among them, soci- 
ety’s demands, and his own capabilities. His 
sometimes bewildered appearance stems from 
his vital concern and internal preoccupation 
with what occasionally appear to him to be 
unresolvable questions. 

A foreclosure subject is distinguished by 
not having experienced a crisis, yet express- 
ing commitment. It is difficult to tell where 
his parents’ goals for him leave off and where 
his begin. He is becoming what others have 
prepared or intended him to become as a 
child. His beliefs (or lack of them) are virtu- 
ally “the faith of his fathers living still." 
College experiences serve only as a confirma- 
tion of childhood beliefs. A certain rigidity 
characterizes his personality; one feels that 
if he were faced with a situation in which 
parental values were nonfunctional, he would 
feel extremely threatened. 

Previous studies have found ego identity 
to be related to “certainty of self-conception” 
and “temporal stability of self-rating” (Bron- 
son, 1959), extent of a subject's acceptance 
of a false personality sketch of himself 
(Gruen, 1960), anxiety (Block, 1961), and 
sociometric ratings of adjustment (Rasmussen, 
1964). Two themes predominate in these 
studies: a variability-stability dimension of 
self-concept, and overall adjustment. In gen- 
eral, subjects who have achieved ego identity 
seem less confused in self-definition and 
are freer from anxiety. 

Four task variables were used to validate 
the newly constructed identity statuses: a 
concept-attainment task administered under 
stressful conditions, a level of aspiration 
measure yielding goal-setting patterns, a 
measure of authoritarianism, and a measure 
of stability of self-esteem in the face of 
invalidating information. 

The hypotheses investigated were these: 

1. Subjects high in ego identity (i.e. 
identity-achievement status) will receive sig- 
nificantly lower (better) scores on the stress- 
ful concept-attainment task than subjects 
lower in ego identity. Subjects who have 
achieved an ego identity, with the internal 
locus of self-definition which that implies, 
will be less vulnerable to the stress condi- 
tions of evaluation apprehension and over- 
solicitousness (see Method section). 

2. Subjects high in ego identity will set 
goals more realistically than subjects low in 
ego identity on a level of aspiration measure. 
The increment to overall ego strength follow- 
ing identity achievement should be reflected 
in the ego function of reality testing.

3. Subjects in the foreclosure status will 
endorse “authoritarian submission and
conventionality” items to a greater extent than
subjects in the other statuses. 

4. There will be a significant positive rela- 
tionship between ego identity measures and a 
measure of self-esteem. 

5. Subjects high in ego identity will change 
less in self-esteem when given false informa- 
tion about their personalities than subjects 
low in ego identity.

6. There will be a significant relationship
between the two measures of ego identity: 
the identity-status interview and the
incomplete-sentences blank.


METHOD 
Subjects 


Subjects were 86 males enrolled in psychology, 
religion, and history courses at Hiram College. 


Confederate Experimenters 


Due to the possibility of contamination by subject 
intercommunication on a small campus, the study 
employed 10 confederate (task) experimenters who 
administered the concept-attainment task in one 
12-hour period to all subjects. These task experi- 
menters, 7 males and 3 females, were members of 
the author’s class in psychological testing and had 
taken three or more courses in psychology. They 
had previously assisted in a pilot study and had 
been checked twice by the author on their experi- 
mental procedure. The use of a sample of experi- 
menters, none of whom were aware of the subjects’ 
standings on crucial independent variables, also has 
advantages in terms of minimizing the effects of 
experimenter bias (Rosenthal, 1964). 

Identity status. Identity status was established by 
means of a 15-30 minute semistructured interview. 
All interviews followed the same outline, although 
deviations from the standard form were permitted
in order to explore some areas more thoroughly. In 
most cases, the criteria for terminating an interview 
involved the completion of the prescribed questions 
as well as some feeling of certainty on the inter- 
viewer’s part that the individual had provided 
enough information to be categorized. Interviews 
were tape-recorded and then replayed for judging. 
Hence, each interview was heard at least twice, 
usually three or four times. 

A scoring manual (Marcia, 1964) was constructed 
using both theoretical criteria from Erikson and 
empirical criteria from a pilot study. Each subject 
was evaluated in terms of presence or absence of 
crisis as well as degree of commitment for three 
areas: occupation, religion, and politics—the latter 
two combined in a general measure of ideology. 
The interview judge familiarized himself with the 
descriptions of the statuses provided in the manual 
and sorted each interview into that pattern which 
it most closely resembled. Analysis of interjudge 
reliability for the identity statuses of 20 randomly 
selected subjects among three judges yielded an 
average percentage of agreement of 75. One of the 
judges was essentially untrained, having been given 
only the scoring manual and the 20 taped interviews. 
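
The average percentage of agreement among three judges can be illustrated with a minimal sketch. It assumes that agreement was computed for each pair of judges and then averaged, which the text does not state; the status codes and ratings below are invented for illustration and are not data from the study.

```python
from itertools import combinations


def percent_agreement(ratings_a, ratings_b):
    """Percentage of interviews that two judges sorted into the same status."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both judges must rate the same interviews")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100.0 * matches / len(ratings_a)


def average_pairwise_agreement(ratings_by_judge):
    """Mean percentage of agreement over all pairs of judges (an assumed procedure)."""
    pairs = list(combinations(ratings_by_judge.values(), 2))
    return sum(percent_agreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical ratings: A = achievement, M = moratorium,
# F = foreclosure, D = diffusion. Not data from the study.
ratings = {
    "judge_1": ["A", "M", "F", "D", "A", "F", "M", "D"],
    "judge_2": ["A", "M", "F", "D", "M", "F", "M", "A"],
    "judge_3": ["A", "F", "F", "D", "A", "F", "M", "D"],
}

print(average_pairwise_agreement(ratings))
```

Simple percentage agreement does not correct for chance agreement, so it is best read alongside the number of categories in use (four here).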

A sample question in the occupational area was: 


How willing do you think you'd be to give up
going into ____ if something better came along?

Examples of typical answers for the four statuses
were: 


[Identity achievement] Well, I might, but I 
doubt it. I can’t see what “something better” 
would be for me. 

[Moratorium] I guess if I knew for sure I 
could answer that better. It would have to be 
something in the general area—something related. 

[Foreclosure] Not very willing. It’s what I’ve 
always wanted to do. The folks are happy with it 
and so am I. 

[Identity diffusion] Oh sure. If something better 
came along, I'd change just like that. 


A sample question in the religious area was: 


Have you ever had any doubts about your re- 
ligious beliefs? 

[Identity achievement] Yeah, I even started 
wondering whether or not there was a god. I've 
pretty much resolved that now, though. The way 
it seems to me is.... 

[Moratorium] Yes, I guess I’m going through 
that now. I just don't see how there can be a god 
and yet so much evil in the world or... . 

[Foreclosure] No, not really, our family is 
pretty much in agreement on these things. 

[Identity diffusion] Oh, I don’t know. I guess 
so. Everyone goes through some sort of stage like 
that. But it really doesn’t bother me much, I
figure one’s about as good as the other! 


Overall ego identity. The Ego Identity Incomplete 
Sentences Blank (EI-ISB) is a 23-item semistruc- 
tured projective test requiring the subject to com- 
plete a sentence “expressing his real feelings” having 
been given a leading phrase. Stems were selected and 
a scoring manual designed (Marcia, 1964) according 
to behaviors which Erikson (1956) relates to the 
achievement of ego identity. Empirical criteria were 
gathered during a pilot study. Each item was scored 
3, 2, or 1 and item scores summed to yield an 
overall ego-identity score. Two typical stems were: 
If one commits oneself ———, and, When I let 
myself go I——. Scoring criteria for the latter stem 
are: 


3—Nondisastrous self-abandonment. Luxuriating 
in physical release. For example, have a good 
time and do not worry about others’ thoughts and 
standards, enjoy almost anything that has laughter 
and some physical activity involved, enjoy myself 
more. 

2—Cautiousness, don't know quite what will 
happen, have to be careful. Defensive or trivial. 
For example, never know exactly what I will say 
or do, sleep, might be surprised since I don't 
remember letting myself go. 

1—Goes all to pieces, dangerous, self-destructive, 
better not to. For example, think I talk too 
much about myself and my personal interests, 
tend to become too loud when sober and too
melodramatic when drunk, sometimes say things
I later regret. 


Analysis of interscorer reliability for 20 protocols
among three judges yielded an average item-by-item
correlation of r = .76, an average total score
correlation of r = .73, and an average percentage of
agreement of 74.


Measures of Task Variables 


Concept Attainment Task performance. The Con- 
cept Attainment Task (CAT) developed by Bruner, 
Goodnow, and Austin (1956) and modified by Weick 
(1964), requires the subject to arrive at a certain 
combination of attributes of cards. The subject may 
eliminate certain attributes by asking whether a card 
is positive or negative for the concept and he may 
guess the concept at any time. He is penalized 5
points for every request, 10 points for every guess, 
and 5 points for every 30 seconds that passes before 
he attains the concept. Level of aspiration was ob- 
tained by informing the subject of his previous time 
and asking him to estimate his time on the next 
problem. 

Quality of performance on the CAT was assessed 
by the following measures: overall CAT scores
(points for time plus points for requests and 
guesses), points for time alone, points for requests 
and guesses alone, number of “give-ups” (problems 
which the subject refused to complete). The main 
level of aspiration measure was attainment discrep- 
ancy or D score, the algebraic average of the differ- 
ences between a subject's stated expectancy for a 
problem and his immediately preceding performance 
on a similar problem. 
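
The penalty rules and the D score described above can be sketched as follows. This is an illustration under stated assumptions, not the scoring program used in the study: time penalties are assumed to accrue only for completed 30-second intervals, and the sign of each difference is chosen so that a positive D means the subject expected to do better (a shorter time) than he had just done. The function names are hypothetical.

```python
def cat_score(requests, guesses, seconds):
    """Penalty points for one CAT problem (lower is better).

    5 points per request, 10 per guess, 5 per 30 seconds elapsed.
    Assumption: only completed 30-second intervals are penalized.
    """
    time_points = 5 * (int(seconds) // 30)
    request_guess_points = 5 * requests + 10 * guesses
    return {
        "time": time_points,
        "requests_guesses": request_guess_points,
        "overall": time_points + request_guess_points,
    }


def d_score(previous_times, expected_times):
    """Attainment discrepancy: algebraic mean of (previous time - expected time).

    Assumption: positive values mean goals set higher than attainment,
    i.e., the subject predicts a faster time than he just achieved.
    """
    if len(previous_times) != len(expected_times) or not previous_times:
        raise ValueError("need one expectancy per preceding performance")
    differences = [prev - exp for prev, exp in zip(previous_times, expected_times)]
    return sum(differences) / len(differences)


# Hypothetical subject: 4 requests, 2 guesses, 95 seconds on one problem.
print(cat_score(requests=4, guesses=2, seconds=95))

# Hypothetical times (seconds) on three problems and the estimates that followed.
print(d_score(previous_times=[120, 100, 90], expected_times=[100, 95, 95]))
```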

A combination of two stress conditions (stress
defined here as externally imposed conditions which
tend to impair performance) was used: evaluation
apprehension and oversolicitousness. Evaluation
apprehension refers to a subject's feeling that his
standing on highly valued personal characteristics
is to be exposed. The characteristic chosen for this 
study was intellectual competence, unquestionably 
salient for college students. Oversolicitousness was 
chosen as a logical complement to evaluation appre- 
hension. It was assumed that unnecessary reassurance 
would validate and, hence, augment whatever anxiety 
the subject was experiencing. 

Pilot study data indicated that the stress condi- 
tions were effective. Using the same task experi- 
menters as in the final study, 56 subjects (27 males 
and 29 females) took the CAT under stress and 
nonstress (i.e., stress omitted) conditions. Each
experimenter ran about 3 stress and 3 nonstress
subjects. Stressed subjects performed significantly more
poorly than nonstressed ones (t = 2.61, df = 54,
p < .02).

Self-esteem change and authoritarianism. The Self- 
Esteem Questionnaire (SEQ-F) is a 20-item test 
developed by deCharms and Rosenbaum (1960) on 
which the subject indicates his degree of endorse- 
ment of statements concerning general feelings of 
self-confidence and worthiness.
In addition, statements reflecting authoritarian
submission and conventionality, taken from the Cali- 
fornia F Scale (Adorno, Frenkel-Brunswik, Levinson, 
& Sanford, 1950), which were originally filler items, 
are used here as a dependent variable. The SEQ-F 
was administered twice, the first time in a classroom 
setting, the second, during the experimental situation 
following an invalidated self-definition. 

The treatment condition of “invalidated self- 
definition” (ISD) followed the CAT and directly 
preceded the second administration of SEQ-F. It 
consisted of giving the subject false information 
concerning the relationship between his alleged
self-evaluation and his actual personality.


Procedure 


Following is the experimental procedure: Subjects 
completed the EI-ISB and SEQ-F in class. Each 
subject was interviewed to determine his identity
status. (This interviewing period lasted about 2
months.) On the day of the experiment, each sub- 
ject went through the following conditions: (a) 
Administration of the CAT under stress by the 
task experimenter. Evaluation apprehension was 
created by the task experimenter's saying: 


By the way, I thought you might be interested 
to know that this test is related to tests of
intelligence² and that it's been found to be one of
the best single predictors of success in college. 
So of course, you'll want to do your very best. 


Oversolicitousness was created during CAT perform- 
ance by the task experimenter's hovering over the 
subject, asking him if he were comfortable, advising 
him not to "tense up," not to “make it harder on 
yourself." (b) Following the CAT, the subject was 
seated in the author's office where he was given 
either a positive or negative (randomly assigned) 
invalidated self-definition. The subject found the 
experimenter intently scanning a data sheet and was 
told: 


I've been looking over some of the data and it
seems that while you consider yourself less [more] 
mature than other subjects, you actually come out 
as being more [less] mature. Is there any way 
you can account for this discrepancy? [Pause for 
the subject’s response.] This seems to hold up 
also for self-confidence. It seems that you consider 
yourself as having less [more] self-confidence 
than other subjects, yet you actually come out 
having more [less]. 


(c) The subject was then sent to another room 
where he took the SEQ-F for the second time. The 
following day, each subject received a postcard from 
the experimenter explaining the false information. 


²In fact, intelligence test scores gleaned from the
subjects’ college files did correlate significantly with 
CAT performance (r=.55, df=82, p< .0005). 
However, no significant relationship was found 
between intelligence and identity status. 
TABLE 1

DIFFERENCES BETWEEN IDENTITY STATUSES IN CAT PERFORMANCE

                                    Time        Requests + guesses     Overall score
Identity status             N     M      SD        M        SD          M         SD
Identity achievement (A)   18   18.17   7.94    599.17   186.63      791.94    244.15
Moratorium (B)             22   24.50  15.77    807.14   495.58     1024.82    612.04
Foreclosure (C)            23   34.20  13.84    875.82   285.44     1147.83    407.98
Identity diffusion (D)     21   29.73  18.52    767.38   266.43     1078.57    352.38

Groups compared          t (time)   t (requests + guesses)   t (overall score)
A versus D                2.39*          2.10*                   3.47***
A versus B + C + D        2.41**         2.28*                   2.45**
A versus C                2.90***        3.47***                 3.19***
A + B + D versus C        2.24*          1.69                    1.63

* p ≤ .05.
** p ≤ .02.
*** p ≤ .01.

RESULTS


Performance on CAT


The relationship between the identity statuses
and CAT performance was investigated
by means of individual t tests. These are
found in Table 1 and support the hypothesis
of significant differences in CAT performance
between subjects high and low in ego identity.

For all three indices of CAT performance
identity-achievement subjects perform
significantly³ better than identity-diffusion subjects
(p’s ranging from .01 to .05), and
identity-achievement subjects perform significantly
better than the other three statuses combined
(p's ranging from .02 to .05).


³All significance levels for t tests are based on
two-tailed tests.


Data involving the number of problems on 
which the subjects in the different identity 
statuses gave up are presented in Table 2. 

Comparing identity-achievement subjects 
with other subjects, significantly fewer in- 
stances of giving up on CAT problems are 
found for the identity-achievement subjects. 
This, together with the previous findings 
concerning the relationship between identity 
status and CAT performance under stress, 
provides substantial confirmation of Hypothe- 
sis 1. 


TABLE 2

NUMBER OF CAT PROBLEMS ON WHICH SUBJECTS IN EACH IDENTITY STATUS GAVE UP

               Identity                                   Identity
               achievement   Moratorium   Foreclosure    diffusion    All other
Give-ups            1             7            13            11           31
Completions       107           125           131           109          365

χ² = 8.93*          χ² = 5.69
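
The two chi-square values can be approximately recovered from the tabled counts. The sketch below assumes, without confirmation from the text, that the larger value compares all four statuses (df = 3) and that the smaller compares identity achievement with all other subjects (df = 1) using Yates' continuity correction. The function name is illustrative.

```python
def pearson_chi_square(table, yates=False):
    """Pearson chi-square for a table of counts given as a list of rows.

    With yates=True, 0.5 is subtracted from each absolute deviation
    (the continuity correction, conventionally applied to 2 x 2 tables).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            deviation = abs(observed - expected)
            if yates:
                deviation = max(deviation - 0.5, 0.0)
            chi2 += deviation ** 2 / expected
    return chi2


# Rows: give-ups, completions. Columns follow Table 2.
four_statuses = [[1, 7, 13, 11], [107, 125, 131, 109]]
achievement_vs_others = [[1, 31], [107, 365]]

print(round(pearson_chi_square(four_statuses), 2))
print(round(pearson_chi_square(achievement_vs_others, yates=True), 2))
```

Under these assumptions the recomputed values fall within rounding distance of the tabled ones, which suggests, but does not prove, that this is how they were obtained.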

An interesting supplementary finding is
that moratorium subjects were significantly 
more variable in overall CAT scores than 
subjects in the other three statuses combined 
(Fmax = 2.62, df = 21/61, p < .05; see Mc- 
Nemar, 1955, pp. 244-247). 

Correlations between all three CAT per- 
formance measures and the EI-ISB, while in 
the expected direction, failed to reach signifi- 
cance. The Pearson r between overall CAT 
performance and EI-ISB scores was —.14 
(df = 82).


Level of Aspiration 


The D, or attainment discrepancy score, 
reflects the difference between a subject’s 
aspirations and his actual performance. An
overall positive D score means that the sub- 
ject tends to set his goals higher than his 
attainment; a negative D score means the 
opposite. 

Inspection of original data revealed that 
no status obtained a negative average D 
score, the range being from 3.60 for identity 
achievement to 5.06 for foreclosure. Analysis 
of variance indicates a significant difference 
among statuses in D score (F = 5.10, df 
= 3/80, p < .01). The t tests presented in
Table 3 show the foreclosure subjects exhibiting
higher D scores than identity-achievement
subjects (t = 3.35, df = 38, p < .01) and
higher D scores than the other statuses combined
(t = 3.70, df = 82, p < .001). It appears
that foreclosure subjects tend to maintain
high goals in spite of failure.


TABLE 3

DIFFERENCES IN D SCORE BETWEEN IDENTITY STATUSES

Identity status             N
Identity achievement (A)   18
Moratorium (B)             22
Foreclosure (C)            23
Identity diffusion (D)     21

Groups compared
C versus A
C versus A + B + D
B versus A                  3
C + B + D versus A         57

TABLE 4

DIFFERENCES IN F SCORES BETWEEN IDENTITY STATUSES

Identity status             N     M       SD
Identity achievement (A)   18   34.28    8.99
Moratorium (B)             23   37.57    8.05
Foreclosure (C)            24   45.17    9.01
Identity diffusion (D)     21   38.67   10.19

Groups compared              t
C versus A                 3.88*
C versus A + B + D         3.75*
D versus A                 44
B versus A                 1.20

* p < .001.


Authoritarian Submission and Conventionality (F)

The t tests presented in Table 4 show
that foreclosure subjects received significantly
higher F scores than identity-achievement
subjects (t = 3.88, df = 38, p < .001) and
also significantly higher F scores than the
other statuses combined (t = 3.75, df = 82,
p < .001).
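
The first comparison can be recomputed from the means, standard deviations, and Ns in Table 4. The sketch assumes an independent-groups t test with pooled variance and treats the tabled SDs as sample (n - 1) standard deviations; neither assumption is stated in the text. The Ns in Table 4 imply df = 40 rather than the reported 38, and the sketch makes no attempt to resolve that difference.

```python
import math


def pooled_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Independent-groups t from summary statistics, pooled-variance form.

    Returns (t, df). Assumes sd_1 and sd_2 are n - 1 standard deviations.
    """
    df = n_1 + n_2 - 2
    pooled_variance = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_1 + 1 / n_2))
    return (mean_1 - mean_2) / standard_error, df


# Foreclosure (C) versus identity achievement (A), values from Table 4.
t_value, df = pooled_t(45.17, 9.01, 24, 34.28, 8.99, 18)
print(round(t_value, 2), df)
```

The recomputed t agrees with the tabled value for C versus A to two decimals; the same function could be applied to the other pairwise rows, although agreement there is not guaranteed.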


Self-Esteem 


The significant relationship found here was
between EI-ISB scores and the initial SEQ
(r = .26, df = 84, p < .01). No significant
differences among identity statuses for SEQ
were found (F = .66, df = 3/82, ns). In
addition, self-esteem appeared to be unrelated
to authoritarian submission and conventionality
(r = -.03, df = 84, ns) and to CAT
performance (r = -.03, df = 82, ns).


Change in SEQ following ISD 


Although differences in the expected direction
were found (i.e., identity achievement
changed less than identity diffusion), these
were not significant (t = 1.39, df = 37,
p < .20). Observer ratings of subjects’ reactions
to the invalidated self-definition indicated
that this treatment condition was
effective. The failure to obtain significant
results may have been due to unreliability
in the self-esteem measure engendered by the
2-month span between the first and second
administration. There was a tendency for
foreclosure subjects given negative information
to show a greater decrease in self-esteem
than identity-achievement subjects under similar
conditions (t = 2.60, df = 19, p < .02).

No relationship was found between EI-ISB
scores and self-esteem change (r = .001, df
= 84, ns).


TABLE 5

DIFFERENCES BETWEEN IDENTITY STATUSES IN EI-ISB SCORES

Identity status             N     M       SD
Identity achievement (A)   18   48.28    5.10
Moratorium (B)             23   48.09    4.23
Foreclosure (C)            24   46.17    4.62
Identity diffusion (D)     21   43.33    3.52

Groups compared              t
A versus C                 1.37
B versus C                 1.41
B versus D                 3.94*
A versus D                 3.89*
A + C + B versus D         3.61*

* p < .001.


EI-ISB Scores and Identity Status 


Two techniques were employed to assess
the relationship between overall ego identity
as measured by EI-ISB and identity status.
These were an analysis of variance among
the four statuses (F = 5.42, df = 3/82, p
< .01), and t tests among the individual
statuses. The latter are found in Table 5.
Identity-achievement subjects received significantly
higher EI-ISB scores than did
identity-diffusion subjects (t = 3.89, df = 37,
p < .001), and the first three identity statuses
taken together received significantly
higher EI-ISB scores than did identity diffusion
(t = 3.62, df = 84, p < .001). Thus,
the distinctive group with respect to EI-ISB
scores appears to be identity diffusion. These
findings lend some support to the hypothesized
relationship between overall ego identity
and identity status.


DISCUSSION


Of the two approaches to the measurement 
of ego identity, the interview, based on indi- 
vidual styles, was more successful than the 
incomplete-sentences test, which treated ego 
identity as a simple linear quality.

Particularly interesting was the relationship
between such apparently diverse areas as
performance in a cognitive task and commitment
to an occupation and ideology. The
interview and the CAT tapped two prime
spheres of ego function: the intrapsychic,
seen on the CAT which required the individual
to moderate between pressing internal
stimuli (stress-produced anxiety) and external
demands (completion of the task),
and the psychosocial, seen in the interview
which evaluated the meshing of the individual's
needs and capabilities with society's
rewards and demands. The relationship between
these two spheres contributes validity
to both the identity statuses and to the
generality of the construct, ego.

No confirmation of the hypothesis relating 
ego identity to resistance to change in self- 
esteem was obtained, possibly because the 
length of time between the first and second 
SEQ administration was 2 months. The variability
in subjects’ self-esteem over this
period of time may have obscured differences
due to treatment alone.

Following are experimentally derived pro- 
files of each status: 

1. Identity achievement. This group scored
highest on an independent measure of ego 
identity and performed better than other 
statuses on a stressful concept attainment 
task—persevering longer on problems and 
maintaining a realistic level of aspiration. 
They subscribed somewhat less than other 
statuses to authoritarian values and their self-esteem
was a little less vulnerable to negative
information.

2. Moratorium. The distinguishing features
of this group were its variability in CAT
performance and its resemblance on other
measures to identity achievement.

3. Foreclosure. This status’ most outstanding
characteristic was its endorsement of
authoritarian values such as obedience,
strong leadership, and respect for authority.
Self-esteem was vulnerable to negative information
and foreclosure subjects performed
more poorly on a stressful concept-attainment
task than did identity-achievement subjects.
In addition, their response to failure on this
task was unrealistic, maintaining, rather
than moderating, unattained high goals. This
behavior pattern is referred to by Rotter
(1954) as “low freedom of movement [and
is associated with] the achievement of 
superiority through identification [pp. 196- 
197]"—an apt description for one who is 
becoming his parents’ alter ego. 

4. Identity diffusion. While this status was
originally considered the anchor point for 
high-low comparisons with identity achieve- 
ment, it occupied this position only in terms 
of EI-ISB scores. CAT performance was uni- 
formly poorer than that of identity achieve- 
ment, although not the lowest among the 
statuses. The identity-diffuse individuals to 
which Erikson refers and identity-diffusion 
subjects in this study may be rather different 
with respect to extent of psychopathology. 
A "playboy" type of identity diffusion may 
exist at one end of a continuum and a schizoid 
personality type at the other end. The 
former would more often be found function- 
ing reasonably well on a college campus. 
While having tapped a rather complete range 
of adjustment in the other statuses, the extent 
of disturbance of an extreme identity dif- 
fusion would have precluded his inclusion 
in our sample. Hence, it is the foreclosure, 
and not the identity-diffusion, subject who 
occupies the lowest position on most task 
variables. 

In conclusion, the main contribution of this 
study lies in the development, measurement, 
and partial validation of the identity statuses 
as individual styles of coping with the
psychosocial task of forming an ego identity.


EXTENSION OF PERSONAL TIME, AFFECTIVE STATES,
AND EXPECTATION OF PERSONAL DEATH¹


PAUL WOHLFORD 


University of Miami 


An individual’s affective state may causally influence his extension of personal 
time into the future (his protension). The present study tested the hypotheses 
(a) positive affect tends to lengthen protension, and (b) negative affect tends to 
shorten protension. In a pretest-manipulation-posttest design, 147 undergraduate 
men and women were assigned to 1 of 3 affect arousals: anticipating a pleasant 
experience, an unpleasant experience, or personal death. The dependent variable 
of protension was assessed by a personal association (PA) measure and by a 
TAT measure. The hypotheses were clearly supported by the PA data, but not 
by the TAT data. The study demonstrated a mediating mechanism which may 
be in part responsible for earlier observations and findings. 


Many have contended that an individual’s
affective state causally influences the length
of future time encompassed by his cognitions.
For example, Osgood (1962a) asserted that 
the threat of mass annihilation through 
nuclear war tends to render us incapable of 
acting instrumentally for our own long-range 
interests. He supported this proposition by 
referring to Köhler’s (1925) experiment in
which monkeys under stress lost their ability 
to delay gratification. Another example of 
affect influencing temporal experience is in 
the clinical observation that depressives often 
have no apparent future, as for them, only 
doom and disaster lie ahead (Straus, 1947). 

The life of any person, depressive or not,
may be temporally limited by his anticipation
of his own death. While death’s inevitability
is a definite fact, the exact time
of death is very indefinite. Thus, one's
cognitive and affective expectation of his
personal death may be a significant determinant
of his extension of personal time into
the future. The primary purpose of the
present study was to explore this hypothesized
relationship, as well as a more general
relation between affects and extension.


¹This study is based on a dissertation presented
to Duke University in candidacy for the degree of
Doctor of Philosophy. The reader is referred to the
dissertation for additional details of the method and
results (Wohlford, 19652). This study was conducted
while the writer held a United States Public Health
Service research fellowship (MH-21, 029-01). A
portion of the data in this paper was presented at
the annual meeting of the Eastern Psychological
Association, Atlantic City, April 1965.

The writer is grateful for advice and encouragement
he received from Michael A. Wallach who was
chairman of the dissertation committee. The writer
also wishes to thank Robert M. Miller and Hillis
M. Scribner for their assistance in scoring the
dependent variable measures.


This study also had two subsidiary objectives:
to clarify the concepts and to refine the
measures of extension of personal time. Terms
concerning personal time, such as “time per- 
spective,” “time orientation,” etc., are con- 
ceptually imprecise and, frequently, opera- 
tionally misleading. As Wallace and Rabin 
(1960) noted, “the terms included within 
the definition [of ‘time perspective’] require 
more precise specification [p. 230].” 

A recent review (Wohlford, 1965a) sug- 
gested the following concepts: An individual’s 
personal time is the total array of his cogni- 
tions which have referents in the past or 
future. The past-future distinction determines
temporal direction. The length of
the time span encompassed by a cognition 
is its extension. Extension into the future 
is protension. Extension into the past is 
retrotension. 

Lewin (1942), Murray (1959), and others 
(see Wohlford, 1965a), have noted that 
psychological research has neglected to study 
cognitions about the future, despite the ap- 
parent importance of anticipatory processes 
for behavior. In light of this deficit, the pres- 
ent investigation devoted primary attention 
to the future. 

Both direct and indirect measures of protension
were related to theoretically mean-
ingful, yet somewhat different variables, as 
reviewed elsewhere (Wohlford, 1965a). Conse- 
quently both the direct and indirect measures 
were used in this experiment. The researches 
on personal time have generally used the 
method of comparing populations who were 
expected to be different in a personal time 
variable. This method has the serious drawback
of not eliminating the possibility that
the groups vary on more than the two or 
three variables under consideration. To 
eliminate extraneous variables, a pretest- 
manipulation-posttest design was chosen in 
the present investigation. It was expected 
that the demonstration of intrapsychic dy- 
namics’ capacity to shorten or to lengthen 
protension would clarify our understanding 
of the personality variables presumed to be 
operative in the earlier studies. 

What intrapsychic dynamics may influence 
protension? Protension is shorter in subjects 
of lower socioeconomic class, delinquents, de- 
pressives, schizophrenics, etc., relative to 
controls. Those who have short protension 
seem to be under the influence of a common 
affect—unhappiness, anxiety, and/or dys- 
phoria. Lewin’s (1942) and Tomkins’ (1962) 
theories also suggest a relation between 
affects and protension. 

Specifically, it was hypothesized that mod- 
erate positive affect tends to lengthen pro- 
tension. When a person is happy, as in con- 
templating the attainment of a valued goal, 
he may be stimulated to consider other 
pleasures and successes. In order to gain additional
gratification, he extends his cognitions
farther into the future.

Conversely, it was expected that negative 
affect tends to shorten protension. When a 
person is unhappy, as from the threat of some 
unwelcomed experience, he may avoid the 
situation by withdrawing to less future expec- 
tations. In this process, he turns to matters 
which are generally more probable in having 
stronger linkages with the present and reality. 
Similarly, it was predicted that the expecta- 
tion of personal death, as an extremely un- 
welcomed event, tends to evoke unequivocally 
negative affect and, consequently, shortens 
protension. Though changes in retrotension
and extension were also under consideration,
no predictions about either were made. 


METHOD 


The dependent variables in protension, retro- 
tension, and extension were each assessed by a direct 
measure and by an indirect measure. Half of each 
measure was given prior to the affect arousal pro- 
cedure, and the other half was given afterward. 
Subjects were randomly assigned to one of three 
affect arousal conditions, namely, unspecified posi- 
tive affect (pleasant or Pl), unspecified negative 
affect (unpleasant or Upl), and specified negative 
affect (Death). The sample of 70 men and 77 women 
was recruited from an introductory psychology
class at Duke University. Men and women were 
tested separately in groups of about 10-20. 


Dependent Variable Measures 


The direct method, the personal association (PA) 
measure, permitted the separate or joint assessment 
of direction and extension of personal time. Each 
subject gave 20 associations in the pretest and 20 
more in the posttest, after which he recorded the 
temporal referents of all 40 associations. For each of 
the 40 items he was asked to “Give the actual date 
or approximate date when the thing occurred or 
probably will occur.” 

These dates were converted to scale scores to 
provide a more representative weight to each item. 
The scale was devised according to what seemed to 
be phenomenally meaningful units of future time 
to the undergraduate subjects at that time (early 
May): 0 = under 2 hours, 1 = 2 hours to under 1
week, 2 = 1 week to under 1 month, 3 = 1-4 months,
4 = 4-12 months, 5 = 1-4 years, 6 = over 4 years.
A subject’s pretest protension score was the total of 
the protension values of his PA Items 1-20; his 
posttest score, the total of Items 21-40. A one-tailed
test of significance was used to test the difference, 
posttest minus pretest. 
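
The conversion from temporal distance to scale value, and the summing of scale values into a protension score, can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, a month is taken as 30 days and a year as 365 days, boundary cases (e.g., exactly 4 months) are assigned to the higher category, and items referring to the past are assumed to contribute nothing to the protension total. None of these details is specified in the text.

```python
from datetime import timedelta

# Exclusive upper bounds for scale values 0 through 5; longer spans score 6.
# Month = 30 days and year = 365 days are assumptions, not from the article.
UPPER_BOUNDS = [
    timedelta(hours=2),       # 0: under 2 hours
    timedelta(weeks=1),       # 1: 2 hours to under 1 week
    timedelta(days=30),       # 2: 1 week to under 1 month
    timedelta(days=4 * 30),   # 3: 1-4 months
    timedelta(days=365),      # 4: 4-12 months
    timedelta(days=4 * 365),  # 5: 1-4 years
]


def scale_value(distance):
    """Map a non-negative temporal distance onto the 7-point (0-6) scale."""
    for value, bound in enumerate(UPPER_BOUNDS):
        if distance < bound:
            return value
    return 6


def protension_score(offsets):
    """Sum the scale values of the future-referent items.

    `offsets` holds signed distances from the time of testing: positive
    for future events, negative for past events (an assumed coding).
    """
    return sum(scale_value(o) for o in offsets if o > timedelta(0))


# Hypothetical associations, shortened to a few items for illustration.
pretest = [timedelta(hours=1), timedelta(days=3), timedelta(days=-10),
           timedelta(days=200), timedelta(days=5 * 365)]
posttest = [timedelta(days=2), timedelta(days=-400), timedelta(days=20)]

# Change score as defined in the text: posttest minus pretest.
change = protension_score(posttest) - protension_score(pretest)
```

A retrotension score could be obtained in the same way by scoring the absolute distance of the past-referent items, and extension as the sum of the two.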

The PA retrotension score was similarly derived, 
using the same 7-point scale for PA protension in 
order to make the two measures comparable for 
analyses of covariation. The PA extension score was 
the sum of the PA protension and retrotension 
scores. As there were no predictions made for 
changes of retrotension and of extension, two-tailed 
tests of significance were required. 

For the indirect method, the TAT was adminis- 
tered under group conditions according to Atkinson’s 
(1958, p. 837) recommendations, except that the 
minute-by-minute urgings were omitted, and the 
four questions were printed together at the top of 
the page, preceded by, “Be sure to include answers 
to the following questions in your story.” Epley 
and Ricks’ (1963) 11-point scale was used to derive 
protension and retrotension values from the TAT 
stories. The pretest and posttest TAT protension, 
retrotension, and extension, were the medians of the 
representative groups of individual story scores. 


Reliabilities and Pretest Scores 


Two people independently scored all PA and TAT 
responses from all 147 subjects according to the 
above protension and retrotension systems. The 
percentages of identical PA scores, considering men and 
women separately, ranged from 93 to 96%. The 
correlations of scores obtained by the two scorers of 
TAT protension and retrotension were .74 and .75 
for the entire sample (N = 147), .70 and .73 for 
the men (N = 70), and .77 and .76 for the women 
(N = 77), respectively. In short, both the PA and 
TAT measures of protension and retrotension had 
highly objective bases. 

The split-half reliability coefficient of the PA pro- 
tension pretest was calculated by applying the 
Spearman-Brown formula to the correlation between 
its odd and even halves. The odd-even reliabilities 
of the total group’s, men’s, and women’s PA protension 
pretest were .74 (N = 147), .78 (N = 70), 
and .69 (N = 77), respectively. Thus, the PA protension 
measure had high internal consistency. 
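
The odd-even procedure can be illustrated with a short sketch. The Spearman-Brown step-up for a test of doubled length is R = 2r / (1 + r), where r is the correlation between the two halves. The function names and the data layout (one list of item values per subject) are assumptions made for the example, not the study's actual computing procedure.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to the reliability of the full test."""
    return 2 * r_half / (1 + r_half)


def odd_even_reliability(item_scores):
    """item_scores: one list of item values per subject, in item order."""
    odd = [sum(row[0::2]) for row in item_scores]   # Items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # Items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))
```

Read in reverse, r = R / (2 - R), so a stepped-up reliability of .74 corresponds to a correlation of about .59 between the odd and even halves.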

The random assignment of subjects to the three 
pretest conditions provided an opportunity to check 
the TAT's internal consistency, and to obtain an 
additional check upon the PA measure. A t test 
(two-tailed) for the difference between independent 
means was used between pairs among the three 
conditions, three time indices, and three subject 
groups.² Seven of the 27 TAT differences were 
significant. Women under Pl had shorter TAT protension 
and extension than women under Upl and Death. 
The total group under Pl had shorter TAT 
retrotension than the total group under Upl and Death. 
Finally, the total group under Pl had shorter TAT 
extension than those under Upl. Only 1 of the 27 
PA differences was significant: men under Upl had 
shorter PA retrotension than men under Pl. 
Generally, the random assignment of subjects to 
the arousal condition led to comparable PA pretest 
scores, but did not lead to comparable TAT pretest 
scores. The significant between-condition differences 
in the pretest scores would invalidate the 
comparisons of these groups’ change scores. Furthermore, 
these differences render the TAT method 
questionable, if not unreliable, as a measure of extension 
and its components. The data of the present study 
did not offer an explanation of these differences. By 
the same token, the lack of significant differences 
among the PA measures increases confidence in their 
stability, and in the conclusion that differences 
obtained by them are not due to chance. 


² The men's and women's pretest PA scores were 
also compared. Women had longer PA retrotension 
than men (M difference = 3.53, t = 2.65, df = 145, 
p < .01, two-tailed test). The sex differences in PA 
protension and extension were nonsignificant. 

Pretest scores of individual subjects ranged widely: 
pretest PA protension, retrotension, and extension 
ranged from 4 to 54, 0 to 40, and 15 to 78, respec- 
tively. Pretest TAT protension, retrotension, and 
extension ranged from 0 to 9, 0 to 7, and 1 to 16, 
respectively. 


Affect Arousal Procedures 


In all conditions, the subjects were given instruc- 
tions to describe a future event, 4 blank pages, and 
12 minutes to complete the task. To the subjects’ 
questions about the task, the experimenter responded 
supportively and nondirectively. The instructions for 
the Pl condition were: 


Someday in the future you will have a quite 
pleasant experience. Please write a brief descrip- 
tion of this event as you imagine it will actually 
happen. In this account, deal with the circum- 
stances concretely and with your feelings in detail. 
Try to give as full and as vivid a picture as you 
can. 


The Upl condition was identical to the Pl condition, 
except that the first sentence in the instructions read: 
"Someday in the future, you will have a quite un- 
pleasant experience." The instructions in the Death 
condition read: 


Someday in the future, you will die. Please write 
a brief description of this event as you imagine 
it will actually happen. In this account, deal with 
the circumstances concretely and in detail. Try 
to give as full and as vivid a picture as you can. 


Under all three conditions, the subjects portrayed 
events which seemed indeed related to the intended 
affect. To assess the effect of the manipulation apart 
from the arousal protocols and the dependent measures, 
a self-descriptive Immediate Mood Scale (IMS) 
was given before and after the arousal procedure 
to detect changes in affect, and a postexperiment 
interview (PEI) included unstructured questions 
about the affect arousal procedure. As both the IMS 
and the PEI required the subjects to respond 
cognitively, and as no physiological measures of 
affect were used, it is possible that the arousal 
procedures changed only the cognitions about the 
affects. Nevertheless, for the purposes of the present 
study, it is assumed that the arousal procedures did 
indeed manipulate affect. Support for this proposi- 
tion was furnished by the IMS and the PEI data 
which were the subjects’ direct reports of their 
construed feelings. For example, relative to the 
subjects’ self-descriptions of immediate mood on the 
IMS pretest, they described themselves on the IMS 
posttest as more happy and bold under Pl, as more 
clutched-up and apprehensive under Upl, and as 
more perplexed and pessimistic under Death. On the 
PEI, about half the subjects under each condition 
retrospectively reported feeling positive affect in the 
Pl condition and negative affect in the Upl or Death 
conditions. When the denial of affect responses (e.g., 
“I don't remember,” “I felt indifferent”) were 
combined with the negative affect responses, the 
percentages of success in arousing the intended 
affects became: Pl, 48; Upl, 65; and Death, 74. 

The sequence of the experimental operations was: 
Session I, TAT pretest, IMS pretest; Session II (1 
week later), PA pretest, one of the three arousal 
devices, TAT posttest, PA posttest, IMS posttest, 


TABLE 1 
INFLUENCE OF THE AFFECT AROUSAL CONDITIONS UPON PROTENSION 
(POSTTEST PROTENSION SCORE MINUS PRETEST PROTENSION SCORE) 

                                 |               TAT protension                  |               PA protension 
Arousal condition                |  N  | Mean change |  SD  |      t      |  p   |      N      | Mean change  |     SD      |      t      |      p 
Total 
  Positive affect (Pl)           |  36 |    +1.15    | 2.41 | [illegible] | .004 | [illegible] | [illegible]  | [illegible] | [illegible] | [illegible] 
  Negative affect (Upl and Death)| 108 |    −0.15    | 3.01 | [illegible] |  ns  | [illegible] | −5.[illegible] | [illegible] | [illegible] | [illegible] 
    Upl                          |  36 |    −0.10    | 3.40 |    0.17     |  ns  |     37      |    −4.30     |    10.14    |    2.58     | .008 
    Death                        |  72 |    −0.18    | 3.83 |    0.40     |  ns  |     74      |    −6.35     |    10.03    |    5.45     | .0001 
Men 
  Pl                             |  17 |    +0.76    | 2.65 |    1.19     |  ns  |     17      |    +9.35     |    14.95    |    2.58     | .011 
  Upl                            |  17 |    −0.24    | 2.31 |    0.42     |  ns  |     18      |    −5.17     |     8.83    |    2.48     | .012 
  Death                          |  33 |    −0.36    | 2.39 |    0.95     |  ns  |     35      |    −6.14     |    10.28    |    3.35     | .002 
Women 
  Pl                             |  19 |    +1.50    | 2.12 |    3.08     | .004 |     19      |    +1.68     |     8.28    |    0.89     | ns 
  Upl                            |  19 |    +0.03    | 4.13 |    0.03     |  ns  |     19      |    −3.47     |    10.86    |    1.39     | <.09 
  Death                          |  39 |    −0.08    | 4.57 |    0.11     |  ns  |     39      |    −6.54     |     9.79    |    4.17     | .0002 

Note.—The t test (one-tailed test) for the difference between correlated means was used to determine the levels of significance 
of the mean protension change. 


scoring of all 40 PA items, PEI, explanation of the 
experiment, and the subjects' promise not to tell 
others about it. 


RESULTS 


The study’s major results are the proten- 
sion changes listed in Table 1. A positive 
change indicates that protension was length- 
ened; a negative change, that protension was 
shortened. 
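
The t values in Table 1 appear to follow from the tabled mean change, SD, and N by the usual formula for correlated means, t = M / (SD / √N), on the assumption that the SD column is the standard deviation of the individual change scores. A minimal sketch (the function name is hypothetical):

```python
from math import sqrt


def t_from_change_summary(mean_change, sd_change, n):
    """t for the difference between correlated means, from summary values.

    Assumes sd_change is the standard deviation of the posttest-minus-pretest
    change scores; the associated degrees of freedom are n - 1.
    """
    return mean_change / (sd_change / sqrt(n))


# PA protension, total group under Death: N = 74, mean change -6.35, SD 10.03
t_death = t_from_change_summary(-6.35, 10.03, 74)

# PA protension, men under Pl: N = 17, mean change +9.35, SD 14.95
t_men_pl = t_from_change_summary(9.35, 14.95, 17)
```

Rounded to two places, the absolute values are 5.45 and 2.58, which agree with the entries printed in the table.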


Moderate Positive Affect Tends to Lengthen 
Protension 


The PA data of the total group under Pl 
indicated a significant lengthening of pro- 
tension. The significant change was main- 
tained in the men alone, while it diminished 
to a nonsignificant level in the women. 

Apparently, the TAT data of the total 
group and women alone under Pl also supported 
the first hypothesis. However, this 
support is mitigated by the fact that the pretest 
TAT protension of these groups under Pl 
was significantly shorter than the corresponding 
scores under the other conditions. The observed 
lengthening in the former groups may 
have been due to regression towards the mean. 
Thus, the TAT data were somewhat incon- 
clusive regarding the first hypothesis. 


Negative Affect Tends to Shorten Protension 


The two conditions testing this hypothesis, 
Upl and Death, were analyzed separately and 


together. The TAT data under either or both 
negative affect conditions did not support the 
hypothesis. In contrast, the PA data under 
each condition offered strong support for the 
hypothesis for the total group, men, and 
women (except women under Upl, where the 
change approached the level of significance, 
p < .09). 

Comparisons of TAT protension changes 
among conditions would have been invalid, as 
there were significant differences in the pre- 
test TAT protension means. All the PA pro- 
tension change comparisons between positive 
affect and negative affect conditions were in 
the predicted direction and were significant 
beyond the .002 level (except women Pl-Upl, 
for which p < .06). 


Differences in PA Protension Changes as a 
Function of Death Scores (DS) 


Precision in predicting a subject’s proten- 
sion change might be gained by referring to 
the actual event he anticipated in the arousal. 
The specification of the event under the Death 
condition permitted the analysis along dimen- 
sions presumed to be relevant, including age 
(60 or over, 40-59, 39 or under), type of 
death (peaceful, peaceful but violent possi- 
bilities mentioned, violent), significant others 
(attending or mourning, mentioned, not mentioned). 
The subjects’ DS scores, ranging 
from 0 to 6, were compared with PA protension 
changes. The subjects who expected the 
least traumatic death (low DS) were predicted 
to experience the least shortening of 
protension; in other words, they would be 
those most likely to lengthen in protension. 
Comparisons confirmed these predictions for 
both men and women, as seen in Table 2. 
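
The z values reported with the Mann-Whitney U statistics in Table 2 are consistent with the large-sample normal approximation, z = (U - n1 n2 / 2) / √(n1 n2 (n1 + n2 + 1) / 12), applied without a continuity or tie correction. That reading is an inference from the tabled numbers, not something the text states. A short sketch with hypothetical function names:

```python
from math import erfc, sqrt


def mann_whitney_z(u, n1, n2):
    """Normal approximation to U (no continuity or tie correction)."""
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


def one_tailed_p(z):
    """Upper-tail normal probability for the absolute value of z."""
    return 0.5 * erfc(abs(z) / sqrt(2))


# Total group, 0 versus 1-6 death score: U = 134.5, group sizes 13 and 61
z_total = mann_whitney_z(134.5, 13, 61)
p_total = one_tailed_p(z_total)
```

For this comparison the absolute z rounds to 3.72 and the one-tailed probability to about .0001, matching the tabled entries.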


Components of the PA Protension Changes: 
Future Direction and Length 


The PA protension measure obtained in- 
formation which allowed the independent 
assessment of its two components, future 
direction and length alone. While PA proten- 
sion correlated highly with its two compo- 
nents, the two components’ correlation with 
each other was about zero (see Table 3). 
Thus, each component may have played a 
vital and independent role in producing the 
observed PA protension changes. Consequently, 
the changes in one component were 
examined while controlling for the other. 
Both future direction and length of PA 
protension tended to change as expected, to 
increase under positive affect and to decrease 
under negative affect. All six changes of the 


TABLE 2 
INFLUENCE OF THE SEVERITY OF ONE'S DEATH EXPECTANCY UPON HIS PA PROTENSION 
(COMPARISONS OF DEATH SCORE AND CHANGE OF PA PROTENSION) 

Group and death score |  N | Median change in PA protension |   U   |  z   |  p* 
Total                 | 74 
  0 death score       | 13 | [illegible] 
  1-6 death score     | 61 | −8          | 134.5 | 3.72 | .0001 
  1 death score       | 11 | −4 
  2-6 death score     | 50 | −9          | 132.0 | 2.68 | .0037 
Men                   | 35 
  0 death score       |  7 | [illegible] 
  1-6 death score     | 28 | −9          |  22.5 | 3.11 | .0009 
  1 death score       |  4 | −5 
  2-6 death score     | 24 | −10         |  41.5 | 0.48 | ns 
Women                 | 39 
  0 death score       |  6 | [illegible] 
  1-6 death score     | 33 | −17         |  51.5 | 1.85 | .0322 
  1 death score       |  7 | −4 
  2-6 death score     | 26 | −8          |  72.5 | 1.72 | .043 

* Mann-Whitney U test (one-tailed test). 


men alone were as expected, though two of 
the changes were nonsignificant. The changes 
of the women alone were not as uniform as 
expected, being influenced by the limitation 
of the measure. In sum, while length alone 
was the primary factor underlying the in- 
crease in PA protension under the positive 
affect condition, future direction was the 
primary factor underlying the decrease in the 
two negative affect conditions. 


Influences of the Affect Arousal upon Retro- 
tension and upon Extension 

While TAT retrotension tended to shorten 
under Upl, and tended to lengthen under 
both Pl and Death, all between-condition com- 
parisons were nonsignificant. 

PA retrotension manifested no significant 
changes under Pl for men, women, or the 
total group. All three groups of subjects un- 
der Upl and under Death extended signifi- 
cantly farther into the past on the PA meas- 
ure. PA retrotension changes, reduced to the 
two components of direction and length, were 
due chiefly to the past-direction changes, 
though the length of PA retrotension tended 
to increase under Death. 

TAT extension manifested significant 
changes in only those two subgroups which 
had significant between-condition pretest 
differences, casting doubt on the reliability of 
these differences. The net PA extension 
changes were usually small (between —2.00 
and +2.00) and nonsignificant. The two ex- 
ceptions were the total group and men under 
Pl, both of which tended to lengthen in PA 
extension. 


Intercorrelations among the Measures 


The intercorrelations for the total group, 
men, and women are reported in Table 3. The 
correlation between PA protension and TAT 
protension approached the level of statistical 
significance, but neither the future-direction 
nor length component of PA protension when 
paired with TAT protension even approached 
significance. Neither PA retrotension nor its 
components were related to TAT retrotension. 
As mentioned above, future-direction and 
length of PA protension correlated about zero, 
as did past-direction and length of PA 
retrotension (except for women for whom 


TABLE 3 
INTERCORRELATIONS AMONG THE PRETEST MEASURES 
PA Future PA Past PA 
protension | Protension | direction | protension | retrotension | direction | retrotension 
PA protension length 
Total .62*** 
Men -60*** 
Women 64 
Future direction (PA) 
"Total TY eres —.03 
Men Ese —.08 
Women ivt .02 
TAT protension 
Total dis .03 -03 
Men .18* .02 .03 
Women 04 04 —.08 
PA retrotension 
Total -13* — m < 
Men —.30** 
Women Ager 
Past direction (PA; 
Total PY — — —.15* — pu 
Men —.18* 1 
Women —.13 64 
trotension length 
E spares m Mrs — — A2t** —.12* 
Men —.14 Are — 08 
Women dotes. Azer —.23* 
TAT retrotension 
Tot: nA E = A5*** 05 10 04 
Men WT. 01 .02 .01 
Women M ae ‘08 119% 07 


Note.—Ns for the r's of two PA measures: total = 147, men = 70, women = 77. All other Ns: total = 144, men = 67, 
women = 77. 
* p < .10. 
** p < .01. 
*** p < .001. 


there was a significant negative correlation, 
r = −.23, N = 77, p < .05). 

Sex differences were virtually absent, except 
in the pretest correlations of PA protension 
and PA retrotension and their lengths. Men 
obtained negative correlations while women 
obtained positive ones. A similar sex differ- 
ence in the covariation of PA protension and 
retrotension also occurred in the changes 
under Pl (χ² = 11.0, p < .001, two-tailed). 
Significant sex differences in this covariation 
did not occur in changes under Upl or Death. 


DISCUSSION 


The propositions that positive affect lengthens 
protension, and that negative affect shortens 
protension, were clearly confirmed by the 
PA data, which are based on events from the 
individuals’ own lives. Negative affect diminished 
the frequency of cognitions concerning 
the future, and increased the frequency 
of cognitions concerning the past. Negative 
affect also shortened the length into the future 
of the “future” cognitions, though the 
shortening was greater among men than 
among women, and under the Death condition 
than under the Upl condition. 

Consistent with earlier work on personal 
time, the present investigation extended our 
knowledge of the specific mechanisms which 
may mediate in relationships of personal time 
variables and other variables. Of the various 
disturbed conditions related to short proten- 
sion, and referred to above, the depressive 
process is one of the more distinguishable 
processes. The present study directly simu- 
lated a depressive process. Anticipating un- 
welcomed future events did elicit negative 
affect, as one may infer especially from the 
IMS and PEI data. The negative affect, in 
turn, threw the subjects back to their pasts. 
This evidence was consistent with the official 
diagnostic description that the quality of the 
depressive’s thinking is a function of his negative 
affective state. However, the present data 
also offered support for an alternative etio- 
logical theory of depression, that of down- 
ward-spiraling, reciprocal interaction between 
affect and cognition (Tomkins, 1962). 

Osgood (1962b) and Tomkins (1962) con- 
tend that affects are important determinants 
of behavior. The present results suggested one 
such behavior-determining process. Positive 
and negative affect have opposite effects upon 
PA protension, which, as a representation of 
a person’s anticipatory processes, appears to 
be vital in his decision making and in his 
overt behavior. 

Tomkins’ (1962, pp. 328-330) theory 
postulates that in human beings negative 
affect should be minimized. In the present 
investigation's negative affect conditions, the 
foreshortened protension appears to have been 
a cognitive manifestation of a process, like 
denial or repression, that minimized affects. 
The data of the positive affect condition supported 
a reciprocal process that has the potential 
to lengthen protension. 


The above analysis is cogent when protension 
alone is considered. Does it hold for extension 
as well? It does under positive affect, 
but not under negative affect. Under positive 
affect, extension lengthened, as protension 
lengthened while retrotension did not change. 
Under negative affect, extension did not 
change significantly, as retrotension length- 
ened, almost hydraulically, while protension 
shortened. 

When extension is the basic unit of analy- 
sis, the results under negative affect are con- 
sistent with the interpretation that the Lew- 
inian construct of life space is relatively 
stable in its temporal dimensions. According 
to this view, the individual’s rather fixed ex- 
tensions simply shifted backwards in time 
under negative affect. However, this view is 
flatly contradicted by the data under positive 
affect where the total extension was not 
stable, but rather, expanded. Further investi- 
gations, using past as well as future events as 
arousals, might resolve the discrepancy be- 
tween the competing theories. 

One consequence of the present study, thus, 
is the demonstration of a need to reevaluate 
the concepts “time perspective,” “time orien- 
tation,” and even “extension of personal 
time.” To represent more accurately the 
phenomena in question, one of the general 
concepts was delineated into direction and 
extension, and the latter’s components of 
protension and retrotension. If extension’s 
bidirectional components were not distinguished, 
the lawful characteristics of PA protension 
would have been obliterated under 
the negative affect conditions. 

The central issue concerning methods 
raised by the present study was the discrepancy 
between the PA and TAT methods. 
Previously, the direct and indirect methods 
have been generally presumed to measure the 
same thing. The PA protension measure mani- 
fested expected changes while the TAT meas- 
ure did not, in spite of the fact that the TAT 
was given immediately after the affect arous- 
als, and the PA was given after the TAT, 
about 20 minutes after the affect arousals. 

Moreover, pretest PA and TAT protension 
was correlated at the nonsignificant level of 
.11. While this correlation may have been 
attenuated somewhat by a 1-week interval 
between the two tests, it was very similar to 
the only other reported correlation, .12, be- 
tween direct and indirect measures (Graves, 
1962). The present study confirmed the ob- 
servation of a literature review (Wohlford, 
1965a) that direct and indirect measures of 
protension, while phenotypically similar, may 
reflect quite different genotypic variables. 
The PA and TAT were different on: the 
structure of the task, the level of conscious- 
ness studied, and whether the response was 
autobiographical. Especially, the TAT task 
was more structured than the PA task in 
specifying the temporal direction, the story 
form, and the stimulus cue. 

Nonsignificant correlations were also ob- 
tained between future direction and length of 
PA protension, and past direction and length 
of PA retrotension. Pilot data suggested a 
possible interaction of personality disposition 
variables with variables of personal time. 
These are areas for further research. 

The major sex differences in this study 
occurred among components of PA protension 
and PA retrotension. When the PA protension 
components were combined, the sex differences 
were considerably attenuated, verifying 
the PA protension scoring system. The women 
under Pl conformed least to expectations. As 
the experimenter was male, and at least some 
subjects associated the Pl task with sex, 
there may have been a differential defensive- 
ness operating under Pl that made the re- 
sponses of women more varied than those of 
the men. 


Expectation of Personal Death 


As well as lengthening in protension, those 
who expected the least traumatic deaths (low- 
est DS scores) tended to admit a greater 
conscious fear of death, and tended to report 
thinking about death at an earlier age, than 
those who expected more traumatic deaths 
(Wohlford, 1965b). The former individuals 
appear to have integrated a realistic dread of 
death's eventuality into their life plans, and 
seem able to cope more constructively with 
the stress imposed by thinking about personal 
death. 

To have someone mourn after one’s death 
has been alleged to be a motive of suicide 
(Menninger, 1938). While this may be true, 
to be missed after one’s death also appears as 
a factor in a constructive or “healthy” expec- 
tation of personal death, as noted above. As 
many have contended, strong social ties may 
be a partial antidote to nihilism in the face 
of death. 

The present study substantiates and ex- 
tends Osgood's (1962a) explanation of how 
stress inhibits constructive action. An indi- 
vidual's thought about the horror of nuclear 
war may have its greatest impact from the 
threat that he himself will die. This realiza- 
tion and its accompanying negative affect 
probably shorten his protension and decrease 
the frequency of his cognitions concerning 
the future. In turn, this may render 
him more reluctant, less likely, or less effective 
to act instrumentally on his own behalf 
for the attainment of a long-range goal, such 
as the prevention of nuclear war. 


ASSIMILATION AND CONTRAST IN INTERPERSONAL 
PREDICTION WITH CONTROL FOR THE INTERACTION 
OF REAL SIMILARITY AND 
DIFFERENTIAL ACCURACY ¹ 


WILLIAM A. BLANCHARD ² 


University of Oregon 


152 judges (Js) predicted the responses of 8 targets (Ts) to interest-test items. 
4 of the Ts were rated by the J to be high in similarity to himself (HPS) and 
4 were rated to be low in similarity to himself (LPS). Artifactual interaction 
between real similarity and differential accuracy was controlled by balancing 
the number of items on which there was a real similarity between J and T 
with those on which there was a real dissimilarity between J and T. Following 
Berkowitz, it was predicted that Js would assimilate under HPS and contrast 
under LPS. Contrary to expectations, Js tended to assimilate under both conditions. 
However, there was a large and significant difference between assimilation 
effects under these 2 conditions. The effect was large under HPS and 
small under LPS—a difference between conditions which is consistent with the 
assimilation-contrast model of social judgment. 


The present research has two purposes: to 
examine an extension of the analogical as- 
similation and contrast model of social judg- 
ment proposed by Berkowitz (1960) to the 
perhaps more restricted area of interpersonal 
prediction; and to examine the usefulness of 
a method for the experimental control of the 
interaction between real similarity and differ- 
ential accuracy in prediction. 

Berkowitz (1960) has proposed that a use- 
ful analogy might be drawn between the con- 
ditions that lead to assimilation and con- 
trast effects in psychophysical judgments, and 
certain conditions affecting interpersonal 
judgments. In psychophysical judgments, the 
relationship between the magnitude of an 
anchoring stimulus and the magnitude of a 
variable stimulus to be judged has been 
shown to have significant effects on the category 
placement of the variable stimulus 
(Sherif, Taub, & Hovland, 1958). When the 
anchoring stimulus is relatively close to the 
variable stimuli on the physical continuum, 
the judgments of the variable stimuli tend to 
be shifted toward the anchor, that is, assimilation 
occurs. When, however, the anchor is 
relatively far from the stimuli to be judged, 
judgments tend to be shifted away from the 
anchor, that is, contrast occurs. Berkowitz 
has proposed that in interpersonal judgments 
the individual's self-concept serves as the 
anchoring point of a standard stimulus. In 
terms of the analogy, this should lead to 
assimilation effects in interpersonal judg- 
ments when the social object is perceived to 
be highly similar to the self, and to contrast 
effects when the social object is perceived to 
be highly dissimilar to the self. 

The proposed processes of assimilation and 
contrast can be seen as closely related con- 
ceptually to those of assumed similarity and 
assumed dissimilarity—a terminology more 
common in the area of interpersonal predic- 
tion. As with assumed similarity and assumed 
dissimilarity, assimilation and contrast might 
be most simply defined in terms of the agree- 
ment or disagreement between the judge's 
(person making the prediction) own responses 
to the items on a forecast measure and his 
predictions for the target (person whose behavior 
is being predicted) on those items. If 
a judge were assimilating we would expect 
that his predictions for the target would be in 
close agreement with his own responses to 
the items on the test. Conversely if he were 
contrasting we would expect that his own 
responses and his predictions for the target 
would be largely in disagreement. The fre- 
quency of agreement between these two rec- 
ords could then be used as an index of as- 
similation-contrast. 

There is, however, a difficulty in the in- 
terpretation of this convenient measure. This 
difficulty arises because of the failure of this 
scoring procedure to take into account the 
effects of the interaction of real similarity 
between judges and targets on the prediction 
items, and differential accuracy in the judges’ 
predictions. As Cronbach (1955) points out, 
the difficulty lies in the fact that as the pro- 
portion of items on the test on which there is 
a real similarity between the judge and the 
target departs markedly in either direction 
from .50, assimilation and contrast scores (as 
defined in terms of the operations above) and 
accuracy scores will be increasingly corre- 
lated. This correlation will be an artifact of 
the items used. 

This interpretational difficulty can best be 
seen in the hypothetical example of a judge 
and target who are in perfect accord in their 
responses to the items on the forecast meas- 
ure. Say, for example, both judge and target 
respond “like” to all 10 interest items on a 
10-item forecast measure. If the judge then 
predicts that the target will respond “like” to 
these 10 items he will get a perfect assimila- 
tion score on this test. He will also get a 
perfect accuracy score. It is impossible in this 
situation to tell whether the judge’s accuracy 
is a result of accurate prediction based on 
relevant information about the target or is a 
fortuitous result of assimilation that just hap- 
pened to be in accord with reality because of 
the chance selection of these particular 10 
items. Conversely it is impossible to tell 
whether his high assimilation score is a re- 
sult of assimilation or because he is a highly 
accurate judge. 
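
The confound in this hypothetical case can be made concrete. In the sketch below (function and variable names are illustrative only), assimilation is scored as the number of items on which the judge's prediction agrees with his own response, and accuracy as the number on which the prediction agrees with the target's actual response.

```python
def agreement(record_a, record_b):
    """Count the items on which two response records coincide."""
    if len(record_a) != len(record_b):
        raise ValueError("response records must cover the same items")
    return sum(a == b for a, b in zip(record_a, record_b))


judge_own = ["like"] * 10         # the judge's own responses
target_actual = ["like"] * 10     # the target's actual responses
judge_prediction = ["like"] * 10  # the judge's predictions for the target

assimilation_score = agreement(judge_own, judge_prediction)
accuracy_score = agreement(target_actual, judge_prediction)
```

Both scores equal 10. Because the judge's and the target's records are identical, the two scores coincide for any set of predictions whatever, so neither can be interpreted apart from the other.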

One possible method of controlling this 
confounding interaction is to balance the 
number of items on the test on which there 
is a real similarity between the judge and 
target (hereafter referred to as congruent 
items) and the number of items on the test 
on which there is a real dissimilarity between 
the judge and the target (hereafter referred 
to as discrepant items). This procedure 
equalizes real similarity between all judges 
and all targets. It also enables us to phrase 
assimilation and contrast in terms of rela- 
tive error on sets of items with known and 
relevant characteristics (congruent and dis- 
crepant). This provides for some measure of 
control for individual differences in judges in 
what Cronbach has called differential accu- 
racy. 

With this control for real similarity it is 
possible to phrase the hypotheses of the present 
study in terms of relative error. The first 
expectation, following Berkowitz, is that when 
a judge perceives a target to be generally 
similar to himself he will assimilate. We 
would expect that he will predict that the 
target's behavior is similar to his own more 
often than is warranted by reality. In terms 
of relative error, then, we would expect that 
the judge would make more errors on dis- 
crepant items than on congruent items (De 
> Ce; where De represents errors on dis- 
crepant items and Ce represents errors on 
congruent items). The second expectation is 
that when the judge perceives the target to 
be generally dissimilar to himself he will con- 
trast. In terms of relative error we would 
expect that the judge under these conditions 
would make more errors on congruent items 
than on discrepant items (Ce > De). We 
would expect that he would predict that the 
target’s behavior is dissimilar to his own 
more often than is warranted by reality. 

Under these conditions a highly accurate 
judge may make fewer errors than an inac- 
curate judge without seriously confounding 
the interpretation. Assimilation or contrast 
will be reflected in the differences in error on 
the congruent and discrepant items. 
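
The relative-error index can be sketched in a few lines. The example assumes dichotomous responses and a forecast measure already balanced at four congruent and four discrepant items; the function names, the labels returned, and the data are hypothetical.

```python
def relative_errors(judge_own, target_actual, judge_prediction):
    """Return (Ce, De): prediction errors on congruent and discrepant items."""
    ce = de = 0
    for own, actual, predicted in zip(judge_own, target_actual, judge_prediction):
        if predicted == actual:
            continue      # correct prediction, no error to count
        if own == actual:
            ce += 1       # error on a congruent item
        else:
            de += 1       # error on a discrepant item
    return ce, de


def direction(ce, de):
    """Label the pattern of relative error (De > Ce or Ce > De)."""
    if de > ce:
        return "assimilation"
    if ce > de:
        return "contrast"
    return "neither"


own = ["like", "dislike", "like", "dislike", "like", "dislike", "like", "dislike"]
# Items 1-4 are congruent (target agrees with judge); Items 5-8 are discrepant.
actual = ["like", "dislike", "like", "dislike", "dislike", "like", "dislike", "like"]

projecting = list(own)  # judge predicts his own response on every item
opposing = ["dislike" if r == "like" else "like" for r in own]

ce_p, de_p = relative_errors(own, actual, projecting)
ce_o, de_o = relative_errors(own, actual, opposing)
```

The projecting judge errs only on the discrepant items (Ce = 0, De = 4), and the opposing judge only on the congruent items (Ce = 4, De = 0), although both are correct on exactly half of the items. Overall accuracy therefore does not enter into the index.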

The usual operation defining assumed- 
similarity scores (here, assimilation) requires 
two response records: the judge's own re-
sponses to the items and the judge's predic-
tions for the target. The present method re-
quires, in addition, a third response record,
that of the target. The selection of congruent
and discrepant items is made by comparing
the judge's response record with the target's
response record. This can be seen as a variant
on the use of the "intermediary key" in the
analysis of interpersonal perception scores 
suggested by Gage, Leavitt, and Stone (1956). 
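
The item classification described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `split_items`, the dictionary layout, and the four sample items are hypothetical and do not come from the study materials. An item on the target's key counts as congruent when the judge's own response matches the keyed response, and as discrepant otherwise.

```python
def split_items(judge_record, target_record):
    """Partition the target's keyed items into congruent and discrepant lists.

    Both arguments map an item number to "like" or "dislike".
    Only items that appear on the target's key are considered.
    """
    congruent, discrepant = [], []
    for item, keyed_response in target_record.items():
        if item not in judge_record:
            continue  # the judge gave no response to this item
        if judge_record[item] == keyed_response:
            congruent.append(item)
        else:
            discrepant.append(item)
    return congruent, discrepant


# Hypothetical records: item number -> response
judge = {1: "like", 2: "dislike", 3: "like", 4: "dislike"}
target = {1: "like", 2: "like", 4: "dislike"}
congruent, discrepant = split_items(judge, target)
```

With these sample records, items 1 and 4 are congruent and item 2 is discrepant; item 3 is ignored because it is not on the target's key.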


METHOD 
Subjects 


A total of 152 subjects drawn from two intro- 
ductory psychology classes at the University of 
Oregon served as judges in this study (75 women 
and 77 men). There were two samples—one of 24 
women and 22 men, and a second of 51 women and 
53 men. Since no significant differences between 
samples were found they have been combined in 
the report to follow. However, since the male and 
female subjects worked with somewhat different 
materials, the data are reported separately by sex. 


Forecast Measure Construction 


Judges' response records. All judges expressed
their own preference for a set of interest-test items 
selected from the Strong Vocational Interest Blank 
(1946 revised). The men responded to items from 
the men's blank and women to items from the 
women's blank. All items on the original Strong 
which could be responded to with "like" or "dis-
like" were included in the test (men's = 280, wom-
en's = 294). The instructions and the answer sheets
were modified to allow only dichotomous like or
dislike responses.

Targets’ response records. The targets in the pres- 
ent study were "typical" individuals in the Strong 
Occupational criterion groups, for example, the 
"typical" lawyer, "typical" physician, etc. It has 
been clearly shown that the modal responses of 
these criterion groups, as determined by Strong's 
scoring methods, can be predicted by college stu- 
dents with a high degree of accuracy (Blanchard, 
1959). 

The response record for each of these targets con-
sisted of a keyed set of items which were selected
in accordance with the following criteria of scora- 
bility: 

1. All items for which the percentage of response 
of Strong's men-in-general and women-in-general 
groups exceeded 60 or was below 10 for either end 
of the like-dislike response dichotomy were ex- 
cluded. This step was taken to minimize items on 
which correct predictions could be made for a great 
majority of the targets from a simple stereotype of 
men-in-general or women-in-general.

2. A preliminary pool of items was then made up 
for each of the separate occupation groups on the 
men’s and women’s Strong, respectively. This pre- 
liminary pool consisted in each case of those items 
which, within the Strong scoring system, have non- 
zero weights of unlike sign for the two poles of the 
response dichotomy for that occupational group. 
The rationale for this procedure was to obtain items 
which differentiated the interests of the men and
women in each occupational group from those of
men- and women-in-general on both poles of the 
response dichotomy. 

3. The percentage of response figures for each of 
Strong’s criterion groups were then examined indi- 
vidually for each item and all items in each pre- 
liminary pool which did not have at least a 6% 
differential in favor of the positively weighted pole 
were eliminated.3 

4. In the selection of items for the individualized 
forecast measures (described below), items with 
weights of +2 or greater were used only when 
necessary. There is some evidence (unpublished data) 
that items with +2 weights or greater are gen- 
erally easier to predict than those with lesser 
weights. Since it was felt that the phenomena that 
we wished to study would be more in evidence on 
the more difficult (or perhaps we could call them 
more ambiguous) items, it was decided to use only 
those items with +1 weights whenever possible. 

Of the 280 items on the men’s Strong and the 
294 on the women’s, 190 and 194, respectively, were 
ultimately used in the construction of the forecast 
measures to be described below. 

The items finally selected for each occupational 
target group were entered on a response sheet like 
those used by the judges. The response choice which 
was positively weighted defined the response of the 
“typical” individual in the target group. These keys 
constituted the targets' response records.

Selection of targets for each judge. Two groups of 
targets were individually selected for each judge; 
four targets perceived by the judge to be highly 
similar to himself, and four targets perceived by 
the judge to be highly dissimilar to himself. 

Perceived similarity and perceived dissimilarity 
were defined in terms of a Rating Scale for Judg- 
ments of Similarity (RSJS) to self. The materials 
constituting this measure consisted of: a page of 
instructions, a list of occupational groups from the 
Strong (39 for the males, 24 for the females), and 
a vertical scale segmented into 45 spaces. In addi-
tion to the numerical scale, nine titles, listed at equal
intervals along the left-hand margin of the scale, 
were used to give the subjects a general idea of the 
meaning of the various parts of the scale. These 
titles constituted a verbal scale running from “Al- 
most completely dissimilar to myself" at the bot- 
tom of the scale, through “Neutral: Neither similar 
nor dissimilar to myself” at the midpoint of the 
scale, to "Almost completely similar to myself" at 
the top. The subjects were instructed to place each 
of the occupational titles somewhere on the rating 
scale—in each case in accordance with their own 
feelings of the general similarity or dissimilarity 
between themselves and the “typical individual” in 
each of the occupations listed. The four occupa- 


3 We wish to express our appreciation to L. G.
Nicholson and to the late E. K. Strong for their help
in making available the percentage-response figures
for the men-in-general, women-in-general and oc-
cupational criterion groups.




tional groups receiving the highest ranks and the 
four receiving the lowest ranks were selected as the 
high-perceived-similarity (HPS) and the low-per-
ceived-similarity (LPS) targets, respectively.4

Individualized forecast measure. A forecast meas- 
ure was prepared for each subject individually. This 
instrument consisted of eight sections bound to- 
gether in a test booklet. In each section of the test 
booklet, the judge was asked to predict the responses 
of one of the selected targets to 20 interest-test 
items. Ten of these items were congruent items, 
according to our definitions above, and 10 of them 
discrepant. 

The items to be predicted for each target were 
selected by comparing the response record of the 
judge with the response record of the target. Ten 
congruent items were selected first, and then 10 
discrepant items. In each case the item selection was 
random from among the congruent and discrepant 
items on the target’s response record. 
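
As a rough sketch of the selection step just described, the function below draws 10 congruent and 10 discrepant items at random, without replacement, from a target's two pools. The function name `build_section`, the pool contents, and the seed are hypothetical. The text does not say what was done when a pool held fewer than 10 items, so the sketch simply raises an error in that case; that behavior is an assumption.

```python
import random


def build_section(congruent_pool, discrepant_pool, k=10, rng=None):
    """Randomly pick k congruent and k discrepant items for one target.

    Returns a list of (item, item_type) pairs, congruent items first,
    mirroring the order of selection given in the text.
    """
    rng = rng or random.Random()
    if len(congruent_pool) < k or len(discrepant_pool) < k:
        # Assumption: the source does not describe this case.
        raise ValueError("item pool smaller than the number requested")
    section = [(item, "congruent") for item in rng.sample(congruent_pool, k)]
    section += [(item, "discrepant") for item in rng.sample(discrepant_pool, k)]
    return section


# Hypothetical pools of item numbers for a single judge-target pair
section = build_section(list(range(1, 22)), list(range(101, 117)),
                        rng=random.Random(0))
```

Because `random.sample` draws without replacement, no item is repeated within a set of 20, which agrees with the statement that items were not repeated within the set for an individual target.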

The size of the congruent and discrepant item 
pools from which these selections were made varied, 
depending upon the number of items on the tar- 
get’s response record and upon the real similarity 
between the judge and the target on these items.
The mean congruent item pool for men was 21; 
the mean discrepant item pool, 16. The lowest mean 
item pools for male targets were for artist (con- 
gruent M = 14) and for credit manager (discrepant 
M = 10).

For females the mean congruent item pool was 
24; the mean discrepant item pool, 21. The lowest 
mean item pools were for nurse (congruent M = 
14) and for social science teacher (discrepant M = 
12). 

The 20 items so chosen were then typed on pre- 
pared blanks with standardized instructions mimeo- 
graphed on them. These sets of items—each set 
representing a single target—were then bound to- 
gether with answer sheets and a set of general in- 
structions, into the test booklet. Each judge, thus, 
made predictions on 20 items for each of eight dif- 
ferent targets—a total of 160 items, 80 of which 
were congruent and 80 of which were discrepant.5
Four of the targets were HPS and four LPS. 


4 Of the 39 possible targets for the male judges
the lowest frequency of choice was for purchasing 
agent (N = 3). The highest frequency was for mor- 
tician (N = 35). The median frequency was 15. The 
differential frequency of the choice of the occupa- 
tional groups as HPS and LPS targets also varied. 
The greatest differentials were for mortician (1 
HPS and 34 LPS) and lawyer (22 HPS and 1 LPS). 

For the females the lowest frequency of choice
was for dietician (N = 10); the highest for mathe-
matics-physical science teacher (N = 43). The me-
dian was 23. The greatest differentials were for life 
insurance saleswoman (1 HPS and 39 LPS), and for 
English teacher (28 HPS and 5 LPS). 

5 More precisely, each judge made 160 predic-
tions. There was some repetition of items within
each test. On the average there were 104 different

Scoring. Each judge's predictions were compared
with the relevant key and scored as either correct or 
incorrect. This procedure provided four scores for 
each judge. 


1. HiCe: This is the total number of incorrectly 
predicted congruent items summed over the four 
targets rated as highly similar to the self. 

2. HiDe: This is the total number of incor- 
rectly predicted discrepant items summed over 
the targets rated as highly similar to the self. 

3. LoCe: This is the total error on congruent 
items for the targets rated low on the rating scale 
for similarity to self. 

4. LoDe: This is the total error on discrepant 
items for targets rated low on the rating scale. 


In addition to these more directly derived scores,
the algebraic difference between De and Ce scores
(De − Ce) provided an index of assimilation and
contrast. A positive De − Ce difference score is
taken as indicative of assimilation. A negative De −
Ce difference score is taken as indicative of contrast.
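
The four error scores and the De − Ce index can be illustrated with a short sketch. The data layout (a list of sections, each tagged "Hi" or "Lo" for rated similarity) and the names `score_judge`, `Hi_index`, and `Lo_index` are assumptions made for this example; the sample predictions are invented.

```python
def score_judge(sections):
    """Count incorrect predictions by similarity condition and item type.

    sections: list of dicts with
      "similarity": "Hi" or "Lo" (rated similarity of the target)
      "items": list of (item_type, predicted, keyed) tuples, where
               item_type is "congruent" or "discrepant".
    """
    scores = {"HiCe": 0, "HiDe": 0, "LoCe": 0, "LoDe": 0}
    for section in sections:
        for item_type, predicted, keyed in section["items"]:
            if predicted != keyed:  # an error against the target's key
                suffix = "Ce" if item_type == "congruent" else "De"
                scores[section["similarity"] + suffix] += 1
    # Positive index: assimilation; negative index: contrast.
    scores["Hi_index"] = scores["HiDe"] - scores["HiCe"]
    scores["Lo_index"] = scores["LoDe"] - scores["LoCe"]
    return scores


# Invented predictions for one high-similarity and one low-similarity target
sections = [
    {"similarity": "Hi", "items": [("congruent", "like", "like"),
                                   ("discrepant", "like", "dislike"),
                                   ("discrepant", "like", "dislike")]},
    {"similarity": "Lo", "items": [("congruent", "dislike", "like"),
                                   ("discrepant", "dislike", "dislike")]},
]
scores = score_judge(sections)
```

In this toy case the high-similarity index is +2 (errors fall on discrepant items, the pattern read as assimilation) and the low-similarity index is −1 (the error falls on a congruent item, the pattern read as contrast).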


RESULTS 


Table 1 presents the means, standard devi- 
ations, and chance means of the relevant 
error scores for each of the samples. It is 
interesting to note that, as with the subjects 
in an earlier study (Blanchard, 1959), the 
subjects in the present study were able to 
predict the modal responses of Strong's cri- 
terion occupational groups with considerably 
better than chance accuracy. The highest 
mean total error score (Sample II, males) was 
56.4. The probability of anyone obtaining 


items in a test booklet (101 for the men and 106 
for the women). The remaining items (M = 56) 
were repetitions of one or another of these items. 
There was, of course, no repetition of items within 
the sets of 20 for the individual targets. 


TABLE 1

MEANS, STANDARD DEVIATIONS, AND CHANCE MEANS
OF SCORES DERIVED FROM THE INDIVIDUAL
FORECAST MEASURES

                      Females (N = 75)    Males (N = 77)    Chance
Score                   M       SD          M       SD      means
LoCe (40)              11.2    4.26        13.3    4.24       20
LoDe (40)              13.0    4.26        14.7    4.55       20
HiCe (40)               8.4    3.39         9.1    3.69       20
HiDe (40)              17.5    3.80        19.1    4.79       20
LoDe − LoCe             1.9    5.78         1.4    6.33
HiDe − HiCe             9.1    4.82        10.0    6.54
Total error (160)      50.1    7.89        56.2    8.67       80

Note. Number of items appears in parentheses.
such a score by chance alone is less than
.00002 ($z = (X_c - X)/\sqrt{Npq} = 4.25$).
The highest total error score obtained was
77. Only 12 of the 152 subjects obtained
scores greater than 68, a score which is 2
standard deviations below the chance mean.
This has some implications for the fakability
of the Strong.
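
The chance values used here follow from treating each dichotomous prediction as a guess with probability .5 of error. The sketch below, with the hypothetical helper `chance_mean_sd`, computes the chance mean Np and standard deviation sqrt(Npq) under that assumption; it is offered only as an illustration of the arithmetic, not as the authors' computation.

```python
import math


def chance_mean_sd(n_items, p=0.5):
    """Mean and standard deviation of the error count under pure guessing."""
    return n_items * p, math.sqrt(n_items * p * (1 - p))


total_mean, total_sd = chance_mean_sd(160)   # all 160 predictions
subscore_mean, _ = chance_mean_sd(40)        # each 40-item subscore
two_sd_below = total_mean - 2 * total_sd
```

Under these assumptions the chance means are 80 for the total score and 20 for each 40-item subscore, as listed in Table 1, and two standard deviations below the total chance mean falls between 67 and 68, close to the cutoff of 68 cited in the text.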


Hypothesis I 


It was predicted that under conditions of 
high-perceived similarity (hereafter referred 
to as HPS) the judges would make more dis- 
crepant-item errors than congruent-item er- 
rors (De > Ce). 

Table 2 summarizes the data relevant to 
this hypothesis. 

The sign test (Siegel, 1956) was used to 
evaluate the results. 
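
For readers who wish to check the sign-test values, a minimal sketch of the large-sample normal approximation follows. It assumes that ties are dropped and that no continuity correction is applied; that choice is an inference, made because it reproduces the z values reported for the female rows of Tables 2 and 3, and it may not be the exact procedure the authors used. The function name `sign_test_z` is hypothetical.

```python
import math


def sign_test_z(n_predicted, n_opposite):
    """Normal approximation to the sign test, ties excluded beforehand.

    z = (x - n/2) / (0.5 * sqrt(n)), where x is the count in one
    direction and n is the number of untied pairs.
    """
    n = n_predicted + n_opposite
    if n == 0:
        raise ValueError("no untied observations")
    return (n_predicted - n / 2) / (0.5 * math.sqrt(n))


z_female_hps = sign_test_z(71, 2)    # Table 2, females: De > Ce vs. Ce > De
z_female_lps = sign_test_z(45, 27)   # Table 3, females: De > Ce vs. Ce > De
```

Rounded to two places, these give 8.08 and 2.12, the values shown for the female samples.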

This hypothesis is supported by the data of
the present study. The relevant results are in 
the predicted direction and highly significant. 
Out of the 152 subjects tested, 142 made 
more errors on discrepant items than on con- 
gruent items when predicting for targets which 
they had rated as highly similar to themselves. 
Only 6 of the 152 made a greater number of 
errors on congruent items than on discrepant 
items under this condition. The mean differ- 
ence scores (HiDe — HiCe) reported in Table 
1 indicate that this difference is not only sig-
nificant in terms of frequency but also is a
sizable and stable one.


TABLE 2

COMPARISON OF THE FREQUENCY OF De > Ce WITH
THE FREQUENCY OF Ce > De FOR HPS

            N     De > Ce    Ce > De    De = Ce      z
Female      75       71          2          2       8.08*
Male        77       71          4          2       7.14*
Total      152      142          6          4

* p < .00006, two-tailed test.


Hypothesis II

It was predicted that under conditions of
low-perceived similarity (hereafter referred to
as LPS) the judges would make more con-
gruent-item errors than discrepant-item errors
(Ce > De).


TABLE 3

COMPARISON OF THE FREQUENCY OF Ce > De WITH
THE FREQUENCY OF De > Ce FOR LPS

            N     Ce > De    De > Ce    De = Ce      z
Female      75       27         45          3       2.12
Male        77       26         45          6       2.26
Total      152       53         90          9       3.09*

* p < .01, two-tailed test.

A summary of the analysis relevant to this 
hypothesis can be seen in Table 3. The sign 
test was again used to evaluate the results. 

About one-third of the subjects had differ- 
ences between Ce and De scores under LPS 
that were in the predicted direction. The dif- 
ferences for the other two-thirds of the sub- 
jects were in a direction opposite to that pre- 
dicted. 

Although the frequency of De > Ce is not 
sufficiently greater than the frequency of Ce 
> De to be significant in the separate male 
and female samples, the combined data do
show this difference in frequency to be reli- 
able. A significantly greater number of judges 
made more errors on the discrepant items 
than on congruent items even under LPS. As 
can be seen in Table 1, the difference between
mean LoDe and LoCe scores is small, but
positive. Hypothesis II is not supported in 
the data of this study. 

Taken as a group then, the subjects of 
the present study would appear to assimilate 
under both conditions, HPS and LPS. The 
question that this raises is whether or not 
the two conditions had any differential effect 
on the kinds of errors that were made. Com- 
paring the De — Ce indices under these two 
conditions individually for each subject shows 
that for 19 of the 152 subjects the difference 
score, De — Ce, was greater under LPS than 
under HPS. For three subjects these indices 
were equal under both conditions. For the 
remaining 130 subjects De — Ce was greater 
under HPS than under LPS. This difference 
is highly significant (sign test, z = 9.0, p
< .00006, two-tailed test). This difference in
the tendency to assimilate under HPS and
LPS conditions can also be seen in Table 1.
The mean HiDe − HiCe score for both men and
women is positive and relatively large (males,
M = 10.0; females, M = 9.1). The mean
LoDe − LoCe score for all samples is also posi-
tive, but relatively small (males, M = 1.4;
females, M = 1.9).

Although the contrast effects predicted are
not evidenced in the data, there is a significant
difference in the tendency to assimilate under
the two conditions, HPS and LPS, a differ-
ence in a direction consistent with the general
proposal.

DISCUSSION 


The proposal that assimilation and contrast 
effects, analogous to those found in psycho- 
physics, would be found in interpersonal pre- 
dictions has not been clearly substantiated 
in the data of this study. Although the results 
support the expectation of assimilation effects 
at a high level of confidence, the expectation 
of contrast effects has not been substantiated. 
Any more generalized negative conclusion, 
however, must be moderated by the fact that 
the two conditions, HPS and LPS, do effect 
a significant difference in the relative kinds
of errors made by the judges—a difference 
that is congruent with the general proposal. 

This shift in the relative numbers of con-
gruent and discrepant item errors, in a di-
rection, but not size, consistent with the
assimilation-contrast analogy, suggests that
part of the difficulty may be methodological 
rather than theoretical. It is possible that 
the range of possible targets provided the 
judges was too limited. Since no objective 
anchoring points were given the judges in 
their use of the RSJS it may be assumed 
that the anchors were provided by the context 
of the pool of targets. In such a case, it is 
possible that the targets rated low on the 
RSJS were in reality middle-range objects 
on the underlying psychological dimension. 
If more extreme occupations in terms of our 
cultural norms had been included, the rela- 
tive scale values might have better approxi- 
mated the psychological scale values, and the 
resulting distribution of congruent and dis- 
crepant item errors might have supported 
Hypothesis II. 

In any case, it is clear that the assimilation-
contrast model does enable us to predict the
positive De − Ce index under HPS, and the
difference in this index between HPS and
LPS. This also suggests that the De-Ce index 
is a meaningful measure, and one which may 
prove fruitful in further investigations. 

It also seems clear, however, that as a model 
of the prediction processes the assimilation- 
contrast analogy is incomplete. Even under 
HPS, where the model provides the best fit, 
we can make better predictions of the judge's 
predictions by simply saying that the judge 
will be accurate in all his predictions (see 
Table 1). It should be possible to make more 
refined statements with regard to expectations 
of differential assimilation and contrast ef- 
fects. This will require, not necessarily dis- 
carding the assimilation-contrast notion, but 
rather a refinement of our analysis of the 
conditions under which we can expect to find 
such effects.

There are a number of situational factors 
which might influence a judge's predictions 
besides perceived similarity. We would like 
to speculatively suggest two which we feel are 
important enough to warrant detailed study. 

1. Information relevance. By information
relevance is meant the predictive validity of 
the information given to the judge with 
respect to the predictions he is asked to 
make. This would seem to be an important 
variable in determining a judge's predictions. 
As such it would have a potent influence, not 
only on accuracy scores, but also on other 
scores which serve as indicators of other 
processes, such as the De-Ce index. This vari- 
able, surprisingly, has received little attention. 

2. Personal relevance. It would seem reason- 
able to expect that if the item on which the 
prediction was being made had little personal 
relevance for the judge it would make little 
difference to him whether another person was 
similar or dissimilar to himself in that respect. 
It is rather on those items which have per- 
sonally important implications for the judge 
that we would expect the distorting influences 
that produce the kinds of error we have 
identified as assimilation and contrast to be 
most effective. 

Our ignorance with regard to the influence 
of these and other factors supports Cron- 
bach's (1955, 1958) advocacy of a more 
complete and intensive analysis of the pre- 
diction process. The bulk of research on inter-
personal prediction has focused on individual
differences and their correlates. It seems 
clear, however, that before we can understand 
individual variability in predictive behavior 
we will need a better conceptual grasp of, 
and methodological control over, the situa- 
tional variables, both nomothetically and 
idiographically evaluated, which influence 
people's predictions of others' behavior.


REFERENCES 


Berkowitz, L. The judgmental process in person-
ality functioning. Psychological Review, 1960, 67,
130-142.

Blanchard, W. A. Cognitive complexity and the
accuracy of stereotypes. Unpublished master's
thesis, University of Oregon, 1959.

Cronbach, L. J. Processes affecting scores on "under-
standing of others" and "assumed similarity."
Psychological Bulletin, 1955, 52, 177-193.

Cronbach, L. J. Proposals leading to analytic
treatment of social perception scores. In R.
Tagiuri & L. Petrullo (Eds.), Person perception
and interpersonal behavior. Stanford: Stanford
University Press, 1958. Pp. 353-380.

Gage, N. L., Leavitt, G. S., & Stone, G. C. The
intermediary key in the analysis of interpersonal
perception. Psychological Bulletin, 1956, 53, 258-
266.

Sherif, M., Taub, D., & Hovland, C. Assimilation
and contrast effects of anchoring stimuli on judg-
ments. Journal of Experimental Psychology, 1958,
55, 150-155.

Siegel, S. Nonparametric statistics for the behavioral
sciences. New York: McGraw-Hill, 1956.


(Received March 23, 1965) 


ERRATUM 


On page 185, Column 2, of the article "Effects of Subject and Experimenter Attitudes in
Verbal Conditioning," by James H. Bryan and Edward Lichtenstein (Journal of Personality and
Social Psychology, 1966, 3, 182-189), line 5 should read:

conditioning (r=.65).
AGGRESSION TOWARD OUTGROUPS AS A FUNCTION
OF AUTHORITARIANISM AND IMITATION
OF AGGRESSIVE MODELS


RALPH EPSTEIN 
Wayne State University 


This experiment investigated the imitation of aggression towards outgroups 
as a function of the observer's personality characteristics and the stimulus 
characteristics of the aggressive model. Ss were randomly assigned to 8 ex-
perimental conditions in a 2 × 2 × 2 factorial design which was based on
the following independent variables: observer's personality structure (au- 
thoritarianism) and the racial, socioeconomic characteristics of the model. The 
dependent variable, imitative aggression, was defined in terms of shocks ad- 
ministered to a Negro victim during a serial learning task. The findings that 
different ethnic models elicited comparable aggressiveness from high Fs whereas 
the low Fs were more imitative of a Negro than a white model were interpreted 
in terms of the undifferentiated cognitive functioning of high authoritarians. 
The finding that ethnic similarity between the victim and the model facilitated 
imitative aggression was evaluated in relation to current theories regarding out-
group hostility.


Although an increasing body of correla- 
tional evidence points to the role of imita- 
tion of ingroup attitudes as a determinant of 
prejudicial attitudes (Epstein & Komorita, 
1966; Mosher & Scodel, 1960), experimental 
studies of imitatively derived hostility to- 
wards outgroups are virtually nonexistent. 
The potential fruitfulness of such studies 
is suggested by research demonstrating the 
imitative basis of diverse behavioral sys- 
tems, for example, aggression (Bandura, 
Ross, & Ross, 1961), moral judgments 
(Bandura & McDonald, 1963), and autism 
(Eisenberg, 1957). The current study at- 
tempts to extend the range of investigated 
behaviors by focusing on imitation of overt 
aggression as manifested by the administra- 
tion of shock to a victim. Furthermore, 
whereas previous investigations of imitation 
have focused upon either the observer’s per- 
sonality characteristics, that is, self-esteem 
(deCharms & Rosenbaum, 1960) or the 
model's characteristics, that is, status
(Bandura & Kupers, 1964), social approval 
(Gelfand, 1962), the current study assumed 
that maximal prediction may be obtained by 
investigating the interaction between the ob- 
server’s personality structure and the model's 
stimulus characteristics upon aggression 
towards outgroups. 


Thus, the goal of this exploratory study 
was to investigate the personality character- 
istics of the observer and the stimulus char- 
acteristics of aggressive models, that is, race 
and social status, as determinants of imitative 
aggression towards a Negro victim. On the 
basis of theory derived from the Authoritarian 
Personality (Adorno, Frenkel-Brunswik, Lev- 
inson, & Sanford, 1950), as well as research 
by Hart (1957), it may be assumed that 
authoritarian attitudes are a function of pa- 
rental punishment for independent, autono- 
mous behavior and parental approval for con- 
forming, imitative, and submissive behavior. 
Therefore, it was predicted that high authori- 
tarians will be more imitative of an aggres- 
sive model relative to low authoritarians. 
Furthermore, insofar as authoritarian atti-
tudes result from harsh and punitive discipline
which leads to excessive sensitization to power
relations of strong versus weak, superior 
versus inferior, as these dimensions are 
culturally defined by social status, it was 
predicted that whereas high authoritarians 
will be more imitative of a middle-class than 
a working-class model, the low authoritarian’s 
aggressiveness will be relatively uninfluenced 
by the model's differential social status.

The final prediction related to the ethnic 
characteristics of the aggressive model. It was 
assumed that conflict may be aroused by
the cognition that one is aggressing (admin- 
istering shock) against an individual and 
justification for the aggression in terms of 
the victim's provocative behavior or the pres- 
ence of negative cognitions regarding the vic- 
tim is lacking. It may be assumed that college 
students’ awareness of the Negro's underdog 
status in American society would contribute 
to their perception of a white model giving 
shocks to a Negro victim as an instance of 
unjustifiable aggression. On the other hand, 
it is plausible to assume that justification for 
aggressing towards a member of a minority 
group may be derived by the prior observa- 
tion that his own group considers him to 
be a legitimate target for aggression. This 
reasoning leads to the prediction that aggres- 
sion towards a Negro will be facilitated by 
the prior observation of a Negro rather than 
a white aggressive model. Furthermore, on 
the basis of previous research (Berkowitz & 
Holmes, 1959; Weatherley, 1961) regarding 
the generalized and undifferentiated nature 
of authoritarian hostility, it is predicted that 
high authoritarians will show less differentia- 
tion among ethnic models relative to non- 
authoritarian subjects. 


METHOD 
Subjects 


Authoritarianism was measured by the 30-item F 
Scale which was group administered to 144 white, 
male, undergraduate students enrolled in introduc- 
tory psychology courses at Wayne State University. 
One-third of the highest and one-third of the lowest 
scorers were randomly assigned to eight experimental 
conditions (N = 8 per cell) with the remaining 32
subjects assigned to a control group in which sub-
jects were not exposed to an aggressive model. These
experimental groups reflected a 2 × 2 × 2 factorial
design based on the following independent variables: 
authoritarianism, high versus low; socioeconomic 
status of the model, low versus middle; and race 
of the model, Negro versus white. 
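
The allocation described above can be sketched as follows. Several details are assumptions made for illustration and are not stated in the text: that the 32 control subjects were split evenly between high and low scorers, that ties in F Scale scores were broken arbitrarily, and the subject labels, scores, and seed. The function name `assign_conditions` is hypothetical.

```python
import random
from itertools import product


def assign_conditions(f_scores, per_cell=8, seed=None):
    """Assign the top and bottom thirds of F Scale scorers to conditions.

    f_scores: dict mapping subject label -> F Scale score.
    Returns a dict mapping subject -> (F level, model race, model status);
    control subjects get (F level, "control", None).
    """
    rng = random.Random(seed)
    ranked = sorted(f_scores, key=f_scores.get)
    third = len(ranked) // 3
    groups = {"low": ranked[:third], "high": ranked[-third:]}
    # Four model conditions: race of model by socioeconomic status of model
    model_conditions = list(product(("Negro", "white"), ("low", "middle")))
    assignment = {}
    for level, members in groups.items():
        members = members[:]
        rng.shuffle(members)
        for idx, (race, status) in enumerate(model_conditions):
            for subject in members[idx * per_cell:(idx + 1) * per_cell]:
                assignment[subject] = (level, race, status)
        # Assumption: leftover subjects in each third form the control group.
        for subject in members[len(model_conditions) * per_cell:]:
            assignment[subject] = (level, "control", None)
    return assignment


# Hypothetical scores for 144 subjects
assignment = assign_conditions({f"S{i:03d}": i for i in range(144)}, seed=0)
```

With 144 subjects this yields 48 high and 48 low scorers, 8 subjects in each of the eight experimental cells, and 32 controls, which agrees with the counts given in the text.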


Behavioral Situation 


Aggression was defined operationally as a response 
which delivers noxious stimuli to another person. 
A modified “aggression machine" (Buss, 1961) was 
employed so that five intensities of electric shock
ranging from very low to very high could be 
administered to a victim. 

In addition to the white, naive subject, other 
participants included two accomplices, a Negro who 
played the role of the victim, and a Negro or white
accomplice who served as an aggressive model. Upon
arriving at the experimental room, subjects were told 
that they were participating in a study to evaluate 
the effect of shock upon learning. Thus, one partici- 
pant would play the role of a learner, whereas the 
other two would serve as experimenters. In order to 
determine who would play the victim or learner’s 
role, the participants were asked to select a number 
from 1 to 10. Insofar as this procedure was rigged, 
the same Negro accomplice was selected as the 
learner for all subjects. The remaining participants, 
the subject and accomplice, were told that they 
would play the role of experimenters in a study on 
the effects of shock upon learning. This role would 
be played by shocking the learner for incorrectly 
anticipating stimulus words presented serially on a 
memory drum. It was emphasized that the experi- 
menter could press any one of the five buttons 
clearly marked from very low to very high. At this
point, the remaining accomplice (Negro or white) 
“spontaneously” requested to play the role of the 
experimenter first since his time was limited. In this 
manner, the accomplice always served as the aggres- 
sive model. The naive subject was requested to 
observe and record the accomplice’s selection of 
shock so that level of shock could be related to 
rate of learning. In this manner, the subject was 
given an opportunity to observe the level of shock
employed by the second accomplice, now serving as
an aggressive model. 

Depending on the appropriate condition, the 
model was either Negro or white, low or high 
status. In the low status condition, the model
wore old, disheveled clothes and responded to an
orally administered questionnaire so as to reveal the
following information about himself within the hear-
ing distance of the naive subject: family income,
less than $3,000; parental occupation, unemployed.
In the high status condition, the model appeared
well dressed and responded in the following man-
ner: family income, $15,000 per annum; parental
occupation, executive in an advertising firm. 

The Negro victim made a programed series of 
responses such that 32 shocks were administered by 
the model, and subsequently, by the naive subject 
in a seven-trial learning series. Unknown to the
subject, however, a locked switch precluded the 
actual administration of shock to the victim. Fur- 
thermore, after an initial warm-up period in which 
the model delivered only weak shocks to the victim 
for errors during the first serial presentation, he 
delivered the highest level of shock; namely, “very 
high” shock, for subsequent errors. After the victim
had learned the correct order of serially presented 
words, the subject was given the opportunity to play 
the role of the experimenter. This time the victim 
was asked to learn a new set of words. Recordings 
of the shock intensities were made by observing 
a series of differentially colored lights located in the 
experimenter’s room and wired to the aggression 
machine. 

Several procedures were employed in order to 
convince the subject that the aggression machine
was operative. Prior to the first trial, subjects were 
encouraged to touch the victim's electrodes and 
receive a sample shock. In addition, the subject 
observed the experimenter carefully place the elec-
trodes on the victim's wrist and fingertips. Finally,
the victim emitted appropriate groans subsequent to 
each shock.


Measure of the Dependent Variable 


The dependent variable, aggression, was opera- 
tionally defined in terms of the intensity of shock 
administered to the victim. In line with Buss' (1961) 
suggestion that the administration of weak or mild 
shock levels may be indicative of a motive to help 
the victim learn more effectively, whereas utiliza- 
tion of very strong shock intensities may be more 
directly indicative of aggression, it was decided to 
score each subject's protocol by counting only 
those shocks whose intensities were labeled as 
"very strong." 
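The scoring rule just described can be sketched in a few lines. This is a minimal illustration, assuming each subject's protocol is recorded as a list of intensity labels; the label names other than "very strong" and the example protocol are hypothetical and do not come from the study.

```python
# Minimal sketch of the scoring rule: only shocks labeled "very strong"
# count toward a subject's aggression score.
VERY_STRONG = "very strong"


def aggression_score(protocol):
    """Return the number of shocks in one protocol labeled 'very strong'."""
    return sum(1 for label in protocol if label == VERY_STRONG)


# Hypothetical protocol (invented for illustration, not study data).
example_protocol = ["weak", "mild", "very strong", "strong", "very strong"]
print(aggression_score(example_protocol))
```

Under this rule, weak and mild shocks never raise the score, which is consistent with the reading that such shocks may reflect a motive to help the victim learn.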


RESULTS 


For the purpose of intergroup comparisons, 
Table 1 summarizes the means and standard 
deviations of the very strong shocks for the 
eight experimental groups. 

An analysis of variance based on these
scores indicated that whereas the effect of the
model’s differential social status upon imitative
aggression was not significant, the subject’s
authoritarianism and the model’s ethnic
characteristics were important determinants
of imitative aggression. Thus, the main effects
for authoritarianism (F = 16.72, df =
1/64) and race (F = 10.05, df = 1/64), both
significant at the .01 level, indicate that high
authoritarians were more aggressive than
lows and the Negro model elicited greater
aggression relative to the white model.

The predicted interaction between authori- 
tarianism and race barely misses significance 
at the .05 level (F = 3.90, df = 1/64). It is 
likely that significance would have been 
achieved were it not for a mild heterogene- 
ity of variance (F = 10.60, df = 9). How- 
ever, this interaction does suggest an inter- 
esting trend whereby the Negro and white 
model elicited comparable levels of aggres- 
sion from high authoritarians, whereas the 
low authoritarians’ aggressiveness was differ- 
entiated according to the ethnic character- 
istics of the model. More specifically, low 
authoritarians, although administering gen- 
erally less shock relative to high authori- 
tarians, were more imitative of a Negro than 
a white model (t = 8.20, p < .01). Also, the 
white model was more imitated by high 
than low authoritarians (t = 9.72, p < .01). 
Insofar as the difference in shocks between
the experimental groups (M = 27.85) and
the control group (M = 10.63) is highly significant
(t = 8.52, p < .001), it may be
cluded that the observation of an aggressive 
model had a profound effect on levels of 
shock administered by the subjects. Finally, 
Table 1 indicates no support for the predicted 
interaction between authoritarianism and 
social status. 
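The figures reported above can be checked against one another with simple arithmetic. The sketch below assumes equal group sizes (an inference from the error df of 64 and the eight experimental groups, not a statement in the text) and uses the four marginal means printed with Table 1; the variable names are illustrative only.

```python
# Consistency check on the reported values, assuming equal group sizes.
n_groups = 8      # eight experimental groups (from the text)
df_error = 64     # error df reported for each F ratio

# With equal groups, df_error = n_groups * (n - 1), so n = df_error / n_groups + 1.
n_per_group = df_error // n_groups + 1

# Marginal means as printed with Table 1.
mean_high_auth, mean_low_auth = 34.03, 21.67
mean_negro_model, mean_white_model = 32.64, 23.06

# Either pair of marginals should average to the overall experimental mean.
grand_from_auth = (mean_high_auth + mean_low_auth) / 2
grand_from_model = (mean_negro_model + mean_white_model) / 2

print(n_per_group)
print(round(grand_from_auth, 2), round(grand_from_model, 2))
```

Both averages agree with the experimental-group mean of 27.85 cited in the text, which supports reading the four printed values as marginal means.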

TABLE 1
MEANS AND STANDARD DEVIATIONS OF SHOCKS OF EXPERIMENTAL GROUPS

                       High authoritarian     Low authoritarian
Social status          Negro      White       Negro      White
Middle class     M     32.78      32.22       15.00      29.03
                 SD               (10.30)     (16.00)    (9.13)
Working class    M     35.56                             26.67
                 SD    (16.72)    (15.11)     (5.27)     (8.50)

Mean shock intensity
High authoritarian     34.03
Low authoritarian      21.67
Negro                  32.64
White                  23.06

a Standard deviations are enclosed in parentheses.


Discussion


A primary finding in this study is that the
imitation of anti-Negro aggression is a function
of an interaction between the subject’s
level of authoritarianism and the model’s 
racial characteristics. More specifically, these 
results support the prediction that whereas 
ethnic models will elicit comparable aggres- 
siveness from high authoritarians, the low 
authoritarian’s aggression is influenced dif- 
ferentially by the model’s ethnic character- 
istics, that is, greater imitation of a Negro 
than a white model. Although these results
are compatible with previous research (Anis- 
feld, Munoz, & Lambert, 1963; Berkowitz, 
1962; Epstein, 1965; Epstein & Komorita, 
1966) which demonstrated that authoritarian 
hostility is a generalized phenomenon across 
situations, they also suggest that this gen- 
erality may be across models as well as 
targets. Thus, it would appear that the frequently
reported relationship between authoritarianism
and ethnocentrism (Adorno et al.,
1950; Pettigrew, 1959) may not only be a
function of the authoritarian individual's
vulnerability to frustration as manifested by
scapegoating behavior, but also a tendency
to be more imitative of hostile models.
Furthermore, the low authoritarian's tendency
to be significantly more imitative of a
Negro than a white model is congruent with
recent research (Berkowitz, 1962; Weather- 
ley, 1961) which demonstrates greater per- 
ceptual and cognitive differentiation among 
tolerant persons. Unlike these previous find- 
ings, however, the current results suggest that 
the greater discriminability among tolerant 
subjects may occur under nonstressful conditions,
and in relation to aggressive models,
as well as targets of aggression. It is interesting
to note that whereas previous investigators
have reported the greater responsiveness
of low authoritarians to environmental
or situational changes, that is, less childhood 
ethnocentrism as a result of an interracial 
experience (Mussen, 1950), lowered estimates 
of United States’ superiority subsequent to 
the appearance of the sputniks (Mischel & 
Schopler, 1959), the current findings also 
demonstrate similar modifiability in overt 
aggressiveness as a function of external 
conditions. 

There has been increasing recognition that 
a major limitation of traditional social-psychological
conceptualizations of hostility
towards outgroups is the neglect of outgroup
characteristics which facilitate their selection 
as targets (Zawadzki, 1948). For example,
the predictive efficiency of a specific conceptualization,
such as the “scapegoat”
hypothesis, has been enhanced by attention
to such characteristics as “prior dislike”
(Berkowitz, 1962), visibility (Williams, 
1947), and social status (Epstein, 1965; 
Epstein & Komorita, 1966). The current 
study indicates the potential utility of further 
exploring the hypothesis that the perception 
of intragroup hostility may serve to justify 
and thereby contribute to the selection of a 
group as a target for aggression. This effect 
may be pronounced even for those individu- 
als, that is, low authoritarians, who would 
ordinarily refrain from imitating an aggres- 
sive model. More specifically, these findings 
suggest that the low authoritarian's anxiety 
or inhibition regarding the expression of hos- 
tility towards outgroups may dissipate when 
these groups are viewed as victimized by their 
own member. The high authoritarian's
greater imitativeness of the white model relative
to low authoritarians is suggestive of the
high F's greater identification with the ingroup.
This identification may result in a
lower threshold for aggressive behavior when
exposed to a white model.

An important reservation which may be 
placed on the conclusion that intragroup hos- 
tility within a minority group increases the 
aggressiveness of the majority is that this 
experiment has provided no information re- 
garding the potential modeling effects of 
minority group members other than those 
from the victim’s ethnic group. For example, 
it is conceivable that the use of an Oriental 
model may have elicited a comparable degree 
of aggression relative to the Negro model. 
In this case, one would conclude that aggres- 
sion among minority group members, regard- 
less of the degree of similarity between the 
model and the victim, increases the aggressiveness
of the majority. Further research
will be undertaken in which the ethnic affiliation
of the subjects and victims as well as
the models will be varied in order to clarify
these relationships.

An important implication of these results 
is that the development of attitudes of self-rejection
and self-derogation among outgroups
(Clark, 1963; Lewin, 1935), as these atti- 
tudes may be manifested by intragroup hos- 
tility within a minority group, may serve to 
increase the vulnerability of the group to 
rejection and hostility on the part of the 
majority. This interpretation is consistent 
with the naturalistic observation (Arendt, 
1963), that the Nazis’ aggression towards the 
Jews during World War II was made justi- 
fiable by the majority's perception of some 
Jews participating directly and indirectly in 
the liquidation of their own ethnic group. 

Furthermore, this study may have impor- 
tant implications for current theory regard- 
ing the antecedents of hostility towards out- 
groups. The most prevalent formulation, the 
“scapegoat” hypothesis (Berkowitz, 1962) 
suggests that the anticipation of punishment 
for frustration-induced aggression directed 
towards the ingroup results in displacement 
from the original sources of frustration to 
outgroups. However, this hypothesis is not 
clearly compatible with the naturalistic ob- 
servation of a striking dissimilarity between 
the ingroup frustraters and the victims of 
displaced aggression (Buss, 1961). Attempts 
to clarify this inconsistency between theory 
and observation have focused on the ethno- 
centric individual’s “prior dislike” for out- 
groups (Berkowitz, 1959), as well as his poor 
discrimination under stressful conditions 
(Berkowitz, 1962). The current results sug- 
gest that the direction of hostility may be 
determined by an interaction between person- 
ality characteristics of the aggressor and 
the stimulus characteristics of the aggressive
model.

Whereas previous research (Epstein, 1965; 
Epstein & Komorita, 1966) demonstrated 
that the social status of the victim relates to 
his vulnerability to displaced aggression, the 
current study indicates that the model’s social 
status had minimal effect on the imitation of 
aggression. It would appear that the salient 
effects attributable to the ethnic character- 
istics of the model and the victim overshad- 
owed the social status variable. Insofar as 
the social status of the victim was relatively 
undefined and somewhat ambiguous, further 
research might involve the manipulation of
the status variable for both the model and
the victim.


A CROSS-CULTURAL STUDY OF NEED PROFILES 1


S. N. GHEI 


University of Vermont 


While a number of studies of national character have investigated the per- 
sonality characteristics of 2 or more cultural groups with respect to several 
variables, quantitative studies designed to determine the degree of congruence 
between the personality profiles of individual members and the characteristic 
group profile have been lacking. The present study investigated the problem 
of differentiating between 2 cultural groups and classifying the individual mem- 
bers of these groups into their respective populations by means of Fisher's 
discriminant function analysis of objective test data. Edwards Personal Prefer- 
ence Schedule was administered to college students, with a predominantly urban 
middle-class socioeconomic background, in the United States and India. 2 sep- 
arate analyses of male and female data using the linear discriminant function 
technique yielded similar results: the 2 multivariate populations were not 
merely discriminated but also classified at a highly significant level. Differences 
on several of the need variables were systematic and highly significant, and, in
addition, supported by other literature.


Interest in the delineation of distinctive 
personality characteristics of different na- 
tional groups has greatly intensified among 
psychologists during the last decade. Dif- 
ferences in personality test performance have 
been studied by several investigators (Bren- 
gelmann, 1959; Cattell & Warburton, 1961; 
Cohn & Carsch, 1954; Sundberg, 1956); fac- 
tor-analytic techniques have been applied to 
ratings and questionnaire responses to com- 
pare underlying personality dimensions across 
cultures (Comrey, Meschieri, Misiti, & Nen- 
cini, 1965; McClelland, Sturr, Knapp, & 
Wendt, 1958; Morris & Jones, 1955); and, 
in addition, several studies have explored the 
relationship between personality and the pre- 
vailing social, religious, and political systems 
(Carstairs, 1958; Inkeles, Hanfmann, & 
Beier, 1961; Ross, 1962). 

In general, these studies have neglected to 
determine objectively the degree of congru- 
ence between individual personality and the 
prevailing personality pattern within a given 
culture. An objective evaluation of this rela- 
tionship has many implications for the study 
of national character (Inkeles & Levinson, 
1954, pp. 982, 994—995). 


1This study was supported by the Cooperative 
Research Program of the Office of Education, United 
States Department of Health, Education, and Wel- 
fare. Statistical analyses were performed with the 
assistance of James C. Cobb and Norbert F. Char- 
bonneau, Computing Center, University of Vermont. 


The purpose of the study reported here was 
to compare the personality structure of im- 
portant segments of two diverse cultural 
populations and to illustrate the application 
of Fisher's discriminant function for deter- 
mining the degree of congruence between in- 
dividual personality structure and the char- 
acteristic profiles of the two cultures. 


METHOD 
Subjects 


A total of 453 subjects participated in the study, 
235 from India and 218 from the United States. 
The Indian sample ranged in age from 16 to 24 
with a mean age of 18.9 years; the American sample 
ranged in age from 18 to 25 with a mean age of 
19.5 years.

The subjects from India were undergraduates in
four independent colleges affiliated with the University
of Delhi. They were enrolled in a liberal arts
curriculum at the sophomore, junior, and senior
levels. All of them understood English, having studied
it regularly for a period of 6-12 years. A total of
92% of the subjects were Hindus. Most of the
subjects indicated three closely affiliated languages,
namely, Hindi, Urdu, and Panjabi, as their mother
tongue; the rest of the subjects had Bengali, Telegu,
Tamil, English, Gujarati, or Sindhi as their mother
tongue. All of them came from joint family house- 
holds with urban, middle-class socioeconomic back- 
grounds. 

The American subjects were undergraduates enrolled
in a sophomore level course in introductory
psychology at the University of Vermont. All of the
subjects were white; the great majority were from
middle-class socioeconomic backgrounds and came
from urban areas of New York and New Jersey and
urban and rural areas of Vermont.


Procedure


A standardized personality inventory developed 
by Edwards (1959), the Edwards Personal Pref- 
erence Schedule (EPPS), which measures 15 mani- 
fest needs, such as, Achievement, Deference, Auton- 
omy, and the like, was selected for the study. It 
was administered to subjects in India and the 
United States by the investigator within a period of 
2 years. The subjects in India participated in the 
study at the request of the principals of their re- 
spective colleges. Neither the subjects nor the insti- 
tutions were paid for participation in the study. 

The inventory was administered in the original 
form in India to subjects in groups of 23-35 and 
without a time limit being imposed. Most of the 
subjects took about an hour to finish. In the United 
States the inventory was administered during the 
regular 50-minute class hour, as 40 minutes are con- 
sidered adequate for the average college student 
(Edwards, 1959); the class size varied from 60-90. 

In addition to the 15 manifest needs the EPPS 
provides a measure of test consistency based upon
a comparison of the numbers of identical choices
made in two sets of the same 15 items. Edwards
(1959) has stated that if a subject obtains a con- 
sistency score of less than 9, his scores on the need 
variables may be questioned. For this reason only 
those subjects who obtained a consistency score of 
9 or more were included in the study. This excluded 
19 Indian subjects and 6 American subjects. The 
final sample on which the analysis was based was 
composed of 110 males and 102 females from the 
United States and 106 males and 110 females from
India.
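The exclusion rule and the resulting sample sizes can be verified with a short sketch. The record layout (a dictionary with a "consistency" key) and the function name are assumptions made for illustration; only the cutoff of 9 and the counts come from the text.

```python
# Sketch of the screening rule: keep subjects with a consistency score of 9 or more.
MIN_CONSISTENCY = 9  # cutoff attributed to Edwards (1959) in the text


def keep_consistent(subjects, cutoff=MIN_CONSISTENCY):
    """Return only the records whose consistency score meets the cutoff."""
    return [s for s in subjects if s["consistency"] >= cutoff]


# Counts reported in the text.
tested = {"India": 235, "United States": 218}
excluded = {"India": 19, "United States": 6}
final_by_sex = {"India": 106 + 110, "United States": 110 + 102}

retained = {group: tested[group] - excluded[group] for group in tested}
print(retained)
print(retained == final_by_sex)
```

The retained totals (216 and 212) match the sums of the male and female subsamples given for each country.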
RESULTS 


Reliability 


As a first step split-half reliability coeffi- 
cients were computed for each of the 15 need 
variables over a sample of 108 subjects ran- 
domly drawn from the total Indian sample of 
216 subjects. These coefficients, corrected by 
the Spearman-Brown formula, were as fol- 
lows: Achievement, .40; Deference, .56; 
Order, .61; Exhibition, .60; Autonomy, .58; 
Affiliation, .58; Intraception, .57; Succorance, 
.76; Dominance, .55; Abasement, .64; Nur- 
turance, .52; Change, .70; Endurance, .59; 
Heterosexuality, .88; and Aggression, .61. All 
of the coefficients were significant at less than 
the .01 level. In general, the reliability co- 
efficients were lower than those reported by 
Edwards (1959) for American subjects. This 
is, however, understandable in view of the 
fact that the EPPS was not only originally 
standardized on an American population but 
in addition, the reliability coefficients reported 
by Edwards were computed over a sample
considerably larger in size (N = 1509) and
age range (15-59 years) than that employed 
in this study. 
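For readers unfamiliar with the correction mentioned above, the Spearman-Brown formula estimates full-length reliability from the correlation between two half-tests. The sketch below is a generic statement of that formula; the half-test correlation of .25 is back-computed from the reported coefficient of .40 for Achievement and is not a value given in the source.

```python
# Spearman-Brown prophecy formula: r_full = k * r / (1 + (k - 1) * r),
# where k is the lengthening factor (k = 2 for a split-half correction).
def spearman_brown(r_half, factor=2.0):
    """Estimate full-test reliability from a half-test correlation."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def half_correlation_from(r_full):
    """Invert the split-half (k = 2) correction to recover the half-test correlation."""
    return r_full / (2.0 - r_full)


# Back-computed illustration: a corrected coefficient of .40 implies r_half = .25.
r_half = half_correlation_from(0.40)
print(round(r_half, 2), round(spearman_brown(r_half), 2))
```

The example shows how much the correction raises a modest half-test correlation, which is worth keeping in mind when comparing the coefficients listed above.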


Congruence between Individual Personality 
Structure and the Cultural Profile 


The larger samples of American females
and Indian females were divided randomly
into two comparable halves. One of the random
halves (American females, N = 51; Indian
females, N = 55) of each of the two
larger female samples was used in the computation
of the discriminant function, including the
classification formula, and the other two halves
were set aside for cross-validation purposes.
A similar analysis was done for the American
and Indian male data. For a comprehensive
discussion of the discriminant function technique
employed in this study the reader is
referred to Johnson (1949).
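The split-half procedure described above can be illustrated with a small two-group linear discriminant function. This is a sketch under stated assumptions: it uses two synthetic variables and simulated data (the study used the 15 need variables plus the consistency score on real EPPS data), and all function names are hypothetical.

```python
import random


def mean_vec(rows):
    """Column means for a list of two-variable observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def pooled_cov(a, b):
    """Pooled within-group 2x2 covariance matrix."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (a, b):
        m = mean_vec(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(a) + len(b) - 2
    return [[s[i][j] / dof for j in range(2)] for i in range(2)]


def fit_ldf(a, b):
    """Weights w = S^-1 (mean_a - mean_b); cutoff is the score at the midpoint."""
    ma, mb = mean_vec(a), mean_vec(b)
    s = pooled_cov(a, b)
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    cutoff = w[0] * (ma[0] + mb[0]) / 2 + w[1] * (ma[1] + mb[1]) / 2
    return w, cutoff


def percent_correct(rows, is_group_a, w, cutoff):
    """Percentage of rows assigned to their true group."""
    hits = sum(1 for r in rows
               if (w[0] * r[0] + w[1] * r[1] > cutoff) == is_group_a)
    return 100.0 * hits / len(rows)


# Simulated groups (invented data): unit-variance normals with distinct means.
rng = random.Random(0)
group_a = [[rng.gauss(4.0, 1.0), rng.gauss(4.0, 1.0)] for _ in range(100)]
group_b = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(100)]

# First halves fit the function; second halves are held out for cross-validation.
w, cutoff = fit_ldf(group_a[:50], group_b[:50])
cv_a = percent_correct(group_a[50:], True, w, cutoff)
cv_b = percent_correct(group_b[50:], False, w, cutoff)
print(cv_a, cv_b)
```

Fitting on one half and scoring the other guards against the optimism of judging a classification formula on the same cases used to derive it, which is the purpose of the cross-validation halves in the study.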

The test of significance between the two 
female groups on the linear discriminant 
function is given in Table 1. The hypothesis 
of homogeneous groups was rejected at less 
than the .01 level. Similar results were ob- 
tained for the two male groups (Table 2). 
Thus American and Indian subjects may be 
said to differ significantly in their overall 
performance on the EPPS. 
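The F ratios in Tables 1 and 2 follow from the usual relation F = MS between / MS within, with each mean square equal to its sum of squares divided by its df. The sketch below recomputes them from the printed entries; the observation that the printed F values match the rounded mean squares rather than the unrounded ones is an inference from the arithmetic, not a statement in the source.

```python
# Recompute the F ratios from the sums of squares and df printed in Tables 1 and 2.
def f_ratio(ss_between, df_between, ss_within, df_within):
    """F = (SS_between / df_between) / (SS_within / df_within)."""
    return (ss_between / df_between) / (ss_within / df_within)


f_female = f_ratio(0.0504, 16, 0.0436, 89)   # from unrounded mean squares
f_male = f_ratio(0.0158, 16, 0.0242, 91)

# Ratios of the mean squares as rounded in the tables.
f_female_rounded = 0.0032 / 0.0005
f_male_rounded = 0.0010 / 0.0003

print(round(f_female, 2), round(f_female_rounded, 4))
print(round(f_male, 2), round(f_male_rounded, 4))
```

Either way of computing the ratio leaves the conclusion unchanged; the difference only shows how much four-decimal rounding of small mean squares can move the printed F.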

The effectiveness of the linear discriminant 
function in classifying American and Indian 
subjects into their respective populations is 
shown in Table 3. The linear discriminant 
function procedure permitted the correct 
identification of 88.23% of American females 
in the original sample and 80.39% of the 
American females in the cross-validation sam- 
ple. Corresponding figures for the Indian fe- 
males were 85.45% for the original sample 
and 89.09% for the cross-validation sample. 


TABLE 1

ANALYSIS OF VARIANCE OF LINEAR DISCRIMINANT FUNCTION
(FEMALE SAMPLES)

Source            df     SS       MS       F
Within groups     89    .0436    .0005
Between groups    16    .0504    .0032    6.4000
Total            105    .0940

Note.—The consistency variable was included along with the
15 need variables in the discriminant function analysis.


TABLE 2

ANALYSIS OF VARIANCE OF LINEAR DISCRIMINANT FUNCTION
(MALE SAMPLES)

Source            df     SS       MS       F
Within groups     91    .0242    .0003
Between groups    16    .0158    .0010    3.3333
Total            107

Note.—The consistency variable was included along with the
15 need variables in the discriminant function analysis.


Comparable results were obtained with the 
male data. Thus, the linear discriminant func- 
tion procedure correctly identified 78.18% of 
American males in the original sample and 
72.73% in the cross-validation sample. Cor- 
responding figures for the Indian males were 
83.02% for the original sample and 79.24% 
for the cross-validation sample. In both the 
female and male comparisons, the percentage 
of improvement in classification by the linear 
discriminant function procedure over classifi- 
cation by chance was highly significant. 
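The classification percentages can be reproduced from the counts in Table 3, and the comparison with chance can be approximated. The sketch below makes two assumptions that are not confirmed by the source: that the chance value is 50% for a two-group classification, and that a normal approximation to the binomial is an acceptable stand-in for whatever test the author used.

```python
import math


def percent_correct(hits, n):
    """Percentage of subjects classified into their own population."""
    return 100.0 * hits / n


def z_versus_chance(hits, n, p_chance=0.5):
    """Normal-approximation z for observed hits against a chance proportion."""
    expected = n * p_chance
    sd = math.sqrt(n * p_chance * (1.0 - p_chance))
    return (hits - expected) / sd


# Counts from Table 3: (correctly classified, subjects in the half).
us_female_original = (45, 51)
us_male_cross_validation = (40, 55)

for hits, n in (us_female_original, us_male_cross_validation):
    print(round(percent_correct(hits, n), 2), round(z_versus_chance(hits, n), 2))
```

Note that 45 of 51 is 88.24% when rounded; the printed 88.23% suggests the percentages in the text were truncated rather than rounded.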


Difference in Need Structure 


The means and standard deviations, in raw
score units, for each of the need variables and
the results of the t test for the comparisons
between American and Indian females, and 
American and Indian males are given in Ta- 
bles 4 and 5. 


TABLE 3

A SUMMARY COMPARISON OF CLASSIFICATION EFFICIENCY
FOR THE LINEAR DISCRIMINANT FUNCTION PROCEDURE
WITH CHANCE EXPECTATIONS

                      Correct classification by formula
                        Females               Males
Sample                 N         %           N         %
United States
  Original           45 (51)a  88.23**     43 (55)   78.18**
  Cross-validation   41 (51)   80.39       40 (55)   72.73*
India
  Original           47 (55)   85.45**     44 (53)   83.02
  Cross-validation   49 (55)   89.09       42 (53)   79.24

Note.—Significance of deviation is from chance value.
a Figures in parentheses represent the number of subjects
used in each half.
** p < .01.


TABLE 4
COMPARISON OF TOTAL FEMALE SAMPLES

                       India          United States
                     (N = 110)          (N = 102)
Variable              M      SD        M      SD        t
Achievement         14.30   3.16     11.83   3.55
Deference           13.33   3.59     10.41   3.20
Order               13.92   4.23      9.79   3.91
Exhibition          11.61   3.42     13.62   3.52      4.
Autonomy            12.07   4.29     12.83   4.49
Affiliation         14.13   3.74     15.41   4.29      2.
Intraception        16.17   3.99     18.24   4.33
Succorance          12.36   4.84     13.48   4.63
Dominance           13.75   3.95     12.50   4.89      2.
Abasement           16.57   4.64     17.00   4.46
Nurturance          18.00   3.51     17.55   3.92
Change              16.93   4.20     17.07   5.51
Endurance           15.85   4.75     12.08   5.10      5.
Heterosexuality      7.38   6.07     16.07   5.40     10.92***
Aggression          13.56   4.09     12.12   4.61      2.

*** p < .001.


The American females had significantly
higher means than Indian females on Exhibition,
Affiliation, Intraception, and Heterosexuality
and significantly lower means on
Achievement, Deference, Order, Dominance,
Endurance, and Aggression. The American
males scored significantly higher than the
Indian males on Exhibition, Autonomy, and
Heterosexuality, and significantly lower on
Deference, Order, Nurturance, and Endurance.


TABLE 5
COMPARISON OF TOTAL MALE SAMPLES

                       India          United States
                     (N = 106)          (N = 110)
Variable              M      SD        M      SD        t
Achievement         14.24   3.17     15.13   4.90      1.57
Deference           12.48   3.48     11.17   3.65      2.68**
Order               15.07   3.60     10.99   4.89      6.92***
Exhibition          11.88   4.12     13.32   3.70      2.69**
Autonomy            13.16   3.52     14.46   4.24      2.44*
Affiliation         12.84   4.44     12.66   3.70       .32
Intraception        15.27   4.03     16.43   4.84      1.89
Succorance           9.63   4.26     10.34   4.64      1.16
Dominance           15.14   3.63     15.35   4.71       .35
Abasement           15.93   3.89     14.96   4.61      1.66
Nurturance          16.23   4.13     13.20   4.60      5.06***
Change              15.11   3.43     15.82   4.46      1.29
Endurance           16.69   4.34     14.55   5.58      3.13**
Heterosexuality     12.71   6.30     16.95   5.69      5.17
Aggression          13.62   3.91     14.69   4.31      1.90

Furthermore, it is of interest to note that
the combined American female and male
group (N = 212) had significantly higher
means than the combined Indian female and
male group (N = 216) on Autonomy (p <
.01), Exhibition (p < .001), Intraception (p
< .001), and Heterosexuality (p < .001) and
lower on Deference, Order, Nurturance, and
Endurance (all p's significant at less than the
.001 level).
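The t values in Tables 4 and 5 can be approximated from the printed means, standard deviations, and sample sizes. The sketch below assumes a pooled-variance independent-samples t test; the source does not state which formula was used, so small differences from the printed values are to be expected.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t from summary statistics, pooling the two variances."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (m1 - m2) / standard_error


# Need Order, male samples (Table 5): India versus United States.
t_order = pooled_t(15.07, 3.60, 106, 10.99, 4.89, 110)

# Need Deference, male samples (Table 5).
t_deference = pooled_t(12.48, 3.48, 106, 11.17, 3.65, 110)

print(round(t_order, 2), round(t_deference, 2))
```

Both results fall close to the tabled values of 6.92 and 2.68, which suggests the printed t tests are of this general form even if the exact computation differed slightly.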


Discussion 


The present findings on the need structure 
of American and Indian students are con- 
sistent with those reported in a previous study 
(Fuster, 1962). On the variables Achievement,
Order, Exhibition, Affiliation, Heterosexuality,
and Aggression for the female
groups and on Deference, Order, Autonomy,
Nurturance, Endurance, and Heterosexuality
for the male groups, the two studies were in
agreement. The concordance between our
findings and those of Fuster is all the more
striking when it is considered that Fuster’s
Indian sample was collected from a different
region of India (Bombay) than the one used
in this study and, in addition, the two American
samples were also quite different.

The data do not permit us to go into an
extensive characterization of American and
Indian personality organization. However, it
seems worthwhile to explore, albeit in a limited
way, the relationship between the observed
personality structure and certain aspects
of the sociocultural environment.


Relation to Authority: Autonomy versus 
Deference 

The great strength of need Autonomy in
American personality organization has been
noted by several investigators, for example,
Inkeles et al. (1961, p. 212). This finding is
supported by the present study as well: the
total American group (N = 212) scored significantly
higher on need Autonomy (p <
.01) than the total Indian group (N = 216).
In the United States the qualities of initiative,
independence, and individualism are deeply
reinforced from early childhood both by the
family and the society.

Generally speaking, in India deferent patterns
of behavior are highly reinforced from
early childhood as the development of indi- 
vidualism and autonomy in the members of a 
joint family would be detrimental to the 
maintenance of the system. Several investi- 
gators have observed the absence of training 
for autonomy in the Hindu family (Cor- 
mack, 1961; Ross, 1961). Thus, children are 
not permitted to go out unaccompanied or to 
speak freely in front of adults (Cormack, p.
61); also, they are “expected to obey their
parents, especially their fathers, without
question [Ross, p. 128].” French and Zajonc
(1957) have noted the presence of considerable
difference between Indian and American
norms of deference, and Carstairs (1961, p.
545) has pointed out the overdetermined
character of “submission to the father figure”
in the Indian personality. These observations
are supported by the present study in that 
the total Indian group scored significantly 
higher than the total American group on need 
Deference (p < .001). 


Relationships between the Sexes


Important differences exist between cul- 
tures in the amount and type of sexual be- 
havior that is socially accepted. Need Hetero- 
sexuality, so prominent especially in the 
American male personality organization, was 
weakly manifested in the Indian sample. The 
Indian female had the lowest mean score of
all on this variable. This is understandable in
view of the fact that the heterosexuality scale
is comprised of such items as “I like to go
out with members of the opposite sex,” “I
like to engage in social activities with the
opposite sex,” “I like to tell jokes involving
sex,” etc., which do not reflect the prevalent
mores of the Indian society. The relations between
the sexes are rigidly structured and
formalized among members of the Indian 
family. Thus, it is not customary for young 
men and women to mix socially, there is no 
dating or courtship before marriage, and it is 
considered entirely inappropriate to show 
affection in public. Contacts between the 
sexes are restricted to members of the immediate
or extended family and the vast majority
of marriages are arranged by the parents.

It is possible that the observed weak
strength of need Heterosexuality in the case 
of Indians may have some deeper signifi- 
cance. Thus, for example, there is a long tra- 
dition in India of maintaining sexual activi- 
ties and interests passive before marriage. 
Moreover, the development of a strong het- 
erosexual tendency in a society with a nu- 
clear family system such as in the United 
States would be expected as this is very im- 
portant for the establishment of new and in- 
dependent households. However, this would 
be detrimental to the stability of a joint 
family in which individual interests must 
necessarily be subordinated to those of the 
family. In this connection, it is of interest to 
note that the closest ties in the Indian family 
are not between husband and wife, rather 
they are between mother and son, and sister 
and brother (Ross, 1961, p. 177). 


Comparative Differences in Modal Needs 


Four of the needs, namely, Nurturance, 
Change, Abasement, and Intraception, so 
marked in the Indian female personality or- 
ganization were also strongly manifested in 
the American female personality. The highest 
mean scores were obtained on these four 
variables in both the female groups. By con- 
trast the needs that were most prominent in 
the male personality organization were dis- 
similar across the two cultures. 

The highest mean score of the total American
group in this study was on need Intraception,
a finding supported by Edwards'
(1959) normative data based on 1509 college
men and women. The possible explanation
of the superior need Intraception scores
of Americans may lie in the great importance 
of interpersonal relationships in American 
life. One's role in a highly competitive soci- 
ety requires that one learn to be unusually 
sensitive to the feelings and attitudes of 
others. On the other hand, in the traditional 
Indian society an acute intraceptive under- 
standing of others is less apt to be fostered as 
interpersonal relationships and the expected 
roles thereof are well-defined. 

The highest mean score of the total Indian 
group was on need Nurturance. This is con- 
sistent with the deep sense of responsibility 
observed in the Indian joint family system
(Ross, 1961, p. 70). Normally the responsibility
for providing assistance to family
members, friends, and distant relatives rests 
with the well-off elder male members. How- 
ever, the sense of family responsibility is in- 
grained deeply in the Indian women as well, 
to the extent that, according to Cormack 
(1961) “self . . . had no meaning save in 
relationship to family and serving that family 
[p. 104].” 

This study was primarily guided by consid- 
erations of methodological and theoretical 
progress. It has demonstrated the usefulness 
of applying multivariate statistical tech- 
niques to the analysis of personality struc- 
ture across two cultures. The technique 
proved to be effective not only for the pur- 
pose of discriminating between two multi- 
variate populations but also in determining 
the degree of congruence between individual 
personality structure and the group profile. 
The method holds promise for the empirical 
investigation of a multimodal conception of 
national character in largely heterogeneous 
modern nations.


ATTITUDE MANIPULATION IN RESTRICTED ENVIRONMENTS:
II. CONCEPTUAL STRUCTURE AND THE INTERNALIZATION OF PROPAGANDA
RECEIVED AS A REWARD FOR COMPLIANCE 1


PETER SUEDFELD AND JACK VERNON 


Princeton University 


Conceptually complex (abstract) and simple (concrete) Ss underwent 24 hr. 
of sensory deprivation (SD) or nonconfined control (NC) treatment. Towards 
the end of this period, each S had to evaluate the meaning of each of 7 
passages which presented 2-sided information about Turkey. If S responded so 
as to show that the passage was pro-Turk, he was rewarded by the presenta- 
tion of the next passage; otherwise, the questions were repeated. This was a 
test of compliance; internalization was measured by changes on an attitude 
scale presented several weeks before, and again immediately after, the experi- 
mental session. Abstract SD Ss showed a greater degree of compliance than 
abstract controls and concrete SD Ss; there was no difference between the 2 
concrete groups. Concrete Ss evidenced more attitude change (internalization) 
than abstracts; in SD, abstract Ss were less and concretes more persuasible 
than in NC (where the 2 groups were about equal). The results were interpreted
in terms of conceptual structure theory.


Several studies have investigated the effects of 
sensory deprivation (SD) upon susceptibility to
persuasive messages (see Suedfeld, 1963). In the 
most recent of these (Suedfeld, 1964a), it was 
suggested that the heightened persuasibility of 
SD subjects may be the result of the sub- 
optimal availability of information which char- 
acterizes the deprivation situation and which 
leads to the increased importance of the informa- 
tion presented in the message itself. As a corol- 
lary of this explanation, individual differences 
in persuasibility were hypothesized. Using the 
theoretical approach of Schroder, Driver, and 
Streufert (in press), it was predicted that ab- 
stract persons (individuals who are able to make 
complex integrations of information) would be 
less responsive to propaganda than would con- 
crete subjects (whose conceptual structure is less 
complex and flexible). The results showed that
SD subjects did change their attitudes more 
than nonconfined controls (NC) after hearing 
a taped propaganda passage, and further that 
concrete subjects evidenced more change than 
abstracts. 

The current study is concerned with the exten-
sion of this problem. Vernon (1963) has de-
scribed a method for producing attitude change
in sensorially deprived subjects. Among other
things, this method involves reinforcing the sub-
ject who shows the desired attitude change by
“a little light . . . a novel food item . . . social
contacts [pp. 30-31].” While this technique
would probably be quite effective in producing
change, another type of reinforcement may be
more subtle and more powerful.

1 This research, carried out at Princeton Univer-
sity, was financed by Grant G-27162 from the
National Science Foundation.

In an excellent series of papers, Jones and his 
associates (Jones, 1964a, 1964b; Jones & McGill, 
1963; Jones, Wilkinson, & Braden, 1961) have 
demonstrated that SD subjects are motivated 
to obtain informational stimuli. If we consider 
the propaganda material as information, we 
could then reinforce subjects who evidence com- 
pliance by giving them a new propaganda mes- 
sage. Thus, we would have a spiral process in 
which each piece of propaganda would present 
another opportunity for compliance, with the 
next piece of propaganda as the reward. 

The question arises whether behavioral com- 
pliance would lead to actual attitude change— 
internalization, as described by Kelman (1961). 
Obviously, it would be quite easy to respond 
to propaganda in the “desired” way for the sake 
of the reinforcer; this would not necessarily 
involve actual attitude change. 

A related question is that of personality dif- 
ferences. Again referring to the theory of 
Schroder et al. (in press) we would say that 
abstract individuals, whose need for information
is relatively high and who find SD more unpleasant
than do concretes (Suedfeld, 1964b),
would consequently show a higher degree of
compliance in order to obtain information; being 
capable of more complex conceptual functioning, 
however, they would not feel the need to change 
their actual attitudes towards the subject matter. 
We thus predicted that among abstract subjects 
the SD condition will result in greater compli- 
ance but less internalization than the NC treat- 
ment; in the concrete group, SD subjects should 
evidence more compliance and more internaliza- 
tion than NCs. 


METHOD 


Subjects and Procedure 


A group of 248 male undergraduates of Rutgers 
University volunteered to undergo 24 hours of SD. 
From this group, we chose subjects whose opinions 
about Turkey and the Turks were neutral (see 
Suedfeld, 1964a). Fourteen abstract and 14 concrete
subjects, all of whom met the neutrality criterion,
were then selected by use of the Sentence Comple-
tion Test (Schroder & Streufert, 1962). Half of 
each group was randomly assigned to the SD (dark- 
ness, silence, and restricted mobility) and half to 
the NC (nonconfined control) treatment (for com- 
plete description of these two treatments, see 
Suedfeld, 1964a). 


Propaganda Material 


The material was the same as had been used in
the previous study (Suedfeld, 1964a), which had
been derived from passages originally devised by
Murphy and Hampton (1962). In the current ex-
periment, however, the combined pro- and anti-
Turk passage was broken down into seven brief
statements. Each of these consisted of a pro-
Turkish item followed by a negative item related
to the same aspect of Turkish life (e.g., “Turkish
justice is swift and impartial, as seen when. . . .
On the other hand, the police force and the courts
are sometimes over-hasty and harsh; for example
. . .”). After each two-sided statement, three
evaluative items of the Turk attitude scale were 
presented on the tape, and the subject responded 
by pressing a button from one to three times. The 
passages were presented during a 1-hour period 
beginning 23 hours after the start of the experi- 
mental session. 


Instructions to Subjects 


Before beginning the experimental session, each 
subject was instructed as follows: 


We are interested in what happens to cognitive 
efficiency under unusual conditions. You have
probably taken tests where you were supposed 
to answer questions about a passage which you 
had just read; we will ask you to do something 
similar. Sometime during the session, you will
hear some passages; at the end of each one, you
will be asked some questions about it. If you get 
the majority of the answers right, there will be a 
short pause; then you will hear a new passage, 
will be asked questions about what it said, and 
so on. If you get the majority of the answers 
wrong, there'll be a longer pause; then the ques- 
tions, but not the passage itself, will be repeated. 
This will go on until you do get the answers 
right, at which time we'll go on to the next 
passage. Remember, we're not interested in your 
own opinion about the topic—just tell us what 
the passage said. 


These instructions were repeated and explained until 
all subjects understood the scheme; subjects were 
also taught how to indicate their answers. 

After these instructions had been given, all sub-
jects were told that the “comprehension test” would
be administered approximately 23 hours after the
beginning of the experiment. SD subjects were then
confined and were left undisturbed until, 23 hours
later, they were alerted by a buzzer and the
propaganda tape began. Control subjects were con-
ditionally dismissed (see Suedfeld, 1964a) and heard
the passages begin as soon as they were put into
the SD chamber for that purpose 23 hours after-
wards. At no time during the experimental session
did the experimenter know whether a given subject
was abstract or concrete.

Pro-Turkish responses were arbitrarily treated as
“correct.” Figure 1 shows the process graphically. 
(These time sequences were used for all subjects.) 
Technically, there was no limit on the number of 
“errors” (negative evaluations) possible; in actual- 
ity, the number ranged from 0 to 10, with higher 
numbers indicating less compliance. 
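
As a rough illustration of the contingency just described, the following Python sketch steps through one session. The names `passes_test` and `run_session` are hypothetical, and the sketch tallies failed tests. The text does not say whether "errors" were counted per failed test or per anti-Turk response, so this is one possible reading and not the authors' scoring rule.

```python
from typing import Iterable, Sequence, Tuple

PRO, ANTI = "pro", "anti"  # hypothetical labels for the two response types


def passes_test(responses: Sequence[str]) -> bool:
    """A three-item test counts as 'correct' with at least 2 pro-Turk responses."""
    return sum(r == PRO for r in responses) >= 2


def run_session(attempts: Iterable[Sequence[str]],
                n_statements: int = 7) -> Tuple[int, int]:
    """Step through successive test attempts.

    Returns (statements completed, failed tests). A passed test is
    'rewarded' by moving on to the next statement; a failed test means
    the questions are repeated for the same statement.
    """
    completed = 0
    failed = 0
    for responses in attempts:
        if completed == n_statements:
            break  # all statements have been passed
        if passes_test(responses):
            completed += 1  # reward: the next statement is presented
        else:
            failed += 1     # the questions are repeated
    return completed, failed
```

Under this reading, a subject who fails one test and passes every other test completes all seven statements with a tally of one.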

At the end of the session, all subjects were pre- 
sented with the Turk attitude scale used in the 
earlier study; they were told that we wanted to 
know their own personal opinions about the Turks, 
since it was possible that their performance on the 
passage tests might be related to their attitudes. 
As before, the maximum degree of internalization 
of the propaganda was indicated by a change of 
plus 12 points (change in the opposite direction 
would be minus 12 points at most). 


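[Figure 1. Flowchart: Statement 1 → Test 1 (three items) → either S answers "correctly" (at least 2 pro-Turk responses) or S answers "incorrectly" (fewer than 2 pro-Turk responses); each branch is followed by a pause (labeled 2-min. and 30-min. in the figure) before Statement 2 or a repetition of the test.]

Fig. 1. Presentation of propaganda material.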

TABLE 1
MEAN NUMBER OF "INCORRECT" RESPONSES

Conceptual structure    SD      NC
Concrete                3.71    3.57
Abstract                1.00    4.29

RESULTS 


As a measure of the degree of compliance, 
we counted the number of "incorrect" (i.e., anti- 
Turk) responses given as evaluations of the 
two-sided propaganda messages (see Table 1). 
Because the distribution of scores was not
normal, nonparametric methods were used to
evaluate the data. When we applied Wilson’s
(1956) analysis of variance, significance was on
the borderline for both the treatments effect
(χ² = 3.59, p = .059) and the interaction ef-
fect (χ² = 3.60, p = .057). SD subjects in gen-
eral were more compliant than NCs; further-
more, abstract SD subjects were significantly
more compliant than concrete subjects in the
same condition (corrected for ties, U = 5.5,
p < .01, one-tailed). NC-SD differences were
not significant for concrete subjects, but were
significant in the abstract group (corrected for
ties, U = 9.5, p = .036, one-tailed).
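
The pairwise comparisons above rest on the Mann-Whitney U statistic. The sketch below shows how U can be obtained from two sets of scores. The function name is hypothetical, and tied pairs receive half credit, which is the usual convention for U itself. The significance computation, including the correction for ties mentioned in the text, is not reproduced, and any scores passed to the function are illustrative and not the study's data.

```python
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (U, U_x, U_y) for two independent samples.

    U_x counts the (x_i, y_j) pairs in which x_i exceeds y_j, with each
    tied pair counted as one half. U_y is the complementary count, and
    U is the smaller of the two.
    """
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x  # the two counts always sum to n_x * n_y
    return min(u_x, u_y), u_x, u_y
```

With seven subjects per cell, as in this design, U_x and U_y sum to 49, so a small U indicates that one group's scores lie almost entirely below the other's.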

Table 2 shows mean attitude change from the 
initial to the postexperimental test. Analysis 
of variance of these data indicates that concrete 
subjects changed significantly more than ab-
stracts (F = 4.659, p < .05) and that there was
a significant interaction effect (F = 5.897, p
< .05); while SD resulted in more change than
NC for concrete subjects, the opposite was the 
case for the abstract group. 
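
The direction of this interaction can be seen from the cell means alone. The sketch below is an illustration under stated assumptions and not a reanalysis: it takes the four means reported in Table 2, reading the NC entries as 2.44 and 2.43 (the decimal placement is assumed), and computes the SD minus NC difference within each group. The function names are hypothetical.

```python
# Cell means as read from Table 2; the decimal placement in the NC column is assumed.
ATTITUDE_CHANGE = {
    "concrete": {"SD": 5.86, "NC": 2.44},
    "abstract": {"SD": 1.00, "NC": 2.43},
}


def simple_effect(cells: dict, structure: str) -> float:
    """SD mean minus NC mean for one conceptual-structure group."""
    return cells[structure]["SD"] - cells[structure]["NC"]


def interaction_contrast(cells: dict) -> float:
    """Difference between the two simple effects (concrete minus abstract)."""
    return simple_effect(cells, "concrete") - simple_effect(cells, "abstract")
```

On these means the simple effect is positive for the concrete group and negative for the abstract group, which is the crossover pattern described in the text. Whether the contrast is reliable depends on within-cell variability, which the cell means do not convey.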


TABLE 2
MEAN ATTITUDE CHANGE

Conceptual structure    SD      NC
Concrete                5.86    2.44
Abstract                1.00    2.43


DISCUSSION


In this experiment, noncompliance was meas-
ured by the number of “incorrect” (anti-Turk)
responses the subject made in evaluating the
passages. As expected, abstract subjects who had
undergone SD showed a higher degree of com-
pliance than did abstract controls; within the
SD treatment, abstract individuals complied to
a higher degree than concretes. Both of these
findings are in accord with theoretical predic-
tions: being highly information oriented, abstract
subjects are stressed by low-information en-
vironments and thus would be expected to strive
harder for an informational reward than either
concretes in the same environment or
abstract individuals who are in a relatively
information-rich situation. It is possible, of
course, that the difference between the two SD
groups resulted from differential ability to
recognize the correct response, but this reason-
ing fails to explain the difference between the
confined and the nonconfined abstract groups.

The fact that concrete subjects complied no 
more in the SD than in the NC treatment was 
not in accordance with predictions. Two pos- 
sible explanations present themselves. One is that 
concrete individuals are so low in information 
motivation that environments as severely sub- 
optimal as SD do not raise information need 
to an appreciable degree; this hypothesis is
contradicted by a finding that in a relatively
mildly suboptimal game situation concrete sub-
jects do increase their information-search activ-
ity (Suedfeld & Streufert, 1964). The second
interpretation is based upon previous findings 
that relatively extreme environmental pressures 
are needed to produce compliance in concrete 
subjects (Allen, 1962; Janicki, 1960). It may 
be that the pressure provided by the SD situa- 
tion is insufficient to overcome their strong 
resistance to change, which may be a dissonance- 
avoiding technique. In this view, compliance may 
be seen as a relatively complex response in which 
the subject takes an “as-if” attitude and acts 
contrary to his own beliefs (thus arousing dis- 
sonance) in order to obtain a reward. Subsequent 
internalization, by the same token, is a simple
(or at least simplifying), dissonance-reducing 
behavior. 

The attitude change (internalization) data are 
generally as hypothesized. As in the earlier study 
(Suedfeld, 1964a), there was no difference be-
tween the abstract and the concrete subjects
in the NC condition. After SD, however, the 
abstract and the concrete groups diverged. 
Greater attitude change on the part of concrete 
SD subjects had been predicted as a result of 
behavioral simplification. Abstract SD subjects 
evidenced relatively little attitude change; this 
datum may be explained by positing that the 
SD situation caused less behavioral simplifica- 
tion in abstract than in concrete subjects. 


GENERALITY OF SOCIAL SCHEMAS


RAE CARLSON AND MARY ANN PRICE
California State College at Fullerton


Social schemas, as described in Kuethe’s work, were investigated among Ss 
differing in age, sex, and social experience. Free response placements of figures 
representing human silhouettes and geometric forms were obtained from groups 
of preadolescents, community adolescents, delinquent adolescents, and adults. 
Results supported the generality of social schemas described by Kuethe. How- 
ever, age differences, sex differences, or age-sex interactions were found in a 
majority of comparisons. More clearly structured relationships were observed 
among male Ss, with some suggestion of Oedipal themes in schemas of male 
adolescents. Some disturbance in parent-child relationships was noted in schemas 
of adolescents in general, along with a tendency for delinquent Ss to give
stereotyped, rather than deviant schemas.


In a series of investigations Kuethe (1962a, 
1962b) has studied the social schemas which 
operate as unit-forming principles in social per- 
ception. Obtaining free-response arrangements of 
figures representing human silhouettes and geo- 
metric forms, Kuethe found random, unorgan- 
ized placements exceedingly rare. Organizing 
principles common to arrangements of all stimu-
lus materials included tendencies to produce
linear and height-ordered arrangements. How-
ever, social schemas clearly emerged in arrange-
ments of human figures, producing organizing
principles which might be expressed as “people
belong together,” “a man and his dog,” “mother
and child,” etc. In other studies Kuethe has
shown these schemas to operate in the repro- 
duction of displays from memory (Kuethe, 
1962b), and to show generality across different 
response modalities (Kuethe, 1964). In inter- 
preting the evidence for social schemas, Kuethe 
has proposed that social schemas may have
clinical implications, suggesting that idiosyn- 
cratic organizations in situations which ordinarily 
elicit common social schemas might "reflect dis- 
turbance of normal social thinking [Kuethe, 
1962a]." 

Underlying this conceptualization of social 
schemas are the assumptions that social schemas 
are learned; that they represent the individual’s 
internalization of conventional ways of organiz- 
ing social stimuli; and that they may reflect dis- 
turbance of social thinking and failure to inter- 
nalize conventional modes of thinking. While 
these assumptions are certainly plausible, they 
need to be tested in a broader population than 
that afforded by the relatively homogeneous 
groups of male undergraduate volunteers studied 
in most of Kuethe’s work. The present study
investigates the generality of the social schemas 
described by Kuethe in such a broader popula- 
tion representing subjects differing in age, sex, 
and social experience. 

The general hypothesis was that the variables 
of age, sex, and socialization should influence the 
formation and operation of social schemas. Two 
general considerations led to expectations of age 
differences. Kuethe’s interpretation of the forma- 
tion of social schemas assumes that they are 
learned in childhood, and come to exercise se- 
lective and directive effects in social perception. 
Since this is surely not a one-trial kind of learn- 
ing, it seemed reasonable to assume that social 
schemas would be better established and more 
preemptive at successive age levels. Thus greater 
proportions of subjects should give the modal 
responses at each age level from preadolescence 
through adulthood. 

In addition, one would expect that the special 
characteristics of different developmental levels 
would be reflected in the utilization of various 
social schemas. Assuming that adolescents, as 
compared with children in the latency period or 
with adults, would reflect some degree of con- 
flict in defining parent-child relations, one would 
expect that adolescent subjects should differ 
from preadolescent and adult subjects in re-
sponding to stimulus materials portraying parent
and child figures.

With regard to sex differences, it was assumed 
that both biological and social factors combine 
to produce somewhat different ways of perceiv-
ing social situations. However, no specific pre-
dictions were made as to how these sex differ-
ences would be reflected in the arrangement of
figures.

Following Kuethe's reasoning that failure to
use common social schemas is related to “dis-
turbance in normal social thinking,” it was pre-
dicted that delinquent adolescents, with back-
grounds of family disorganization and disturbed
social relationships, should differ from a group 
of community adolescents in their use of social 
schemas. Specifically, delinquent subjects were 
expected to use modal social schemas less fre- 
quently. Further, on the basis of studies pointing 
to some degree of confusion in the delinquent 
adolescent's sex-role identity, greater deviation 
from common responses of like-sexed groups was 
expected on those figures where sex differences 
are obtained. 
METHOD 


The 158 subjects included four subgroups: pre-
adolescents, aged 7-11 (20 males, 20 females); com-
munity adolescents, aged 13-16 (20 males, 18 fe-
males); delinquent adolescents, aged 13-16 (20 males,
20 females); and adults, aged 25-50 (20 males, 20 
females). The delinquent adolescents, drawn from 
residents of the Orange County Juvenile Hall, were 
of normal intelligence, and with the exception of 
four cases, not obviously educationally retarded. All 
of the other groups included volunteers drawn from 
middle-class neighborhoods. 

Nine sets of stimulus figures, described in Kuethe's
(1962a) original report, were adapted for administra-
tion outside a laboratory setting. White felt-backed
figures were displayed on a 30 × 40 inch black felt-
covered board mounted on a tripod. The nine sets of
figures included: man, woman, child; man, woman,
dog; man, child; woman, child; man, woman, two
rectangles; three men, three rectangles; three women,
three rectangles; three rectangles; square, circle,
triangle.

All subjects were tested individually, presented 
with stimulus materials in random order, and asked 
to arrange the figures on the board. The experimenter 
recorded the order of arrangement and measured 
distance between figures after each presentation. A
female experimenter conducted all of the sessions 
with the exception of the male Juvenile Hall group. 
However, no differences were observed between 
male delinquent subjects who had a male experi- 
menter and the community adolescent males who 
had a female experimenter. 

In testing hypotheses, frequency comparisons of 
subjects showing presence or absence of specified 
social schemas were evaluated with chi-square tests. 
The nine figures yielded eight major comparisons 
since two sets of figures (man-child, woman-child) 
produced a single measure of relative distance be- 
tween parent and child. 
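
The frequency comparisons described above can be sketched as a Pearson chi-square computed on a table of counts, with rows for groups and columns for presence or absence of a schema. The function name is hypothetical and any counts supplied are illustrative, not the study's data. No continuity correction is applied; whether the authors used one is not stated.

```python
from typing import List, Sequence, Tuple


def pearson_chi_square(table: Sequence[Sequence[float]]) -> Tuple[float, int]:
    """Return (chi-square statistic, degrees of freedom) for an r x c table.

    Expected counts are row total * column total / grand total. Every
    row and column must have a nonzero total.
    """
    row_totals: List[float] = [sum(row) for row in table]
    col_totals: List[float] = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df
```

A comparison of three age levels on presence versus absence of a schema gives a 3 × 2 table and df = 2, while a two-group comparison gives df = 1, matching the degrees of freedom reported in the results.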


RESULTS AND DISCUSSION 


The findings provided considerable evidence
that the modal schemas described by Kuethe are
rather stable ways of organizing perceptions.
For five of the nine sets of stimulus figures,
Kuethe’s modal responses were the modal re-
sponses in all eight of the present subgroups. In
only one set (MWD) did the majority of the
present subgroups deviate from Kuethe’s mode.
Height-ordering tendencies in arranging man,
woman, child and the three rectangles emerged
strongly, as did tendencies toward linear arrange-
ments and toward grouping human figures with-
out letting rectangles intervene between humans.

1 We are grateful to Dale Hayden for his help in
examining Juvenile Hall subjects.


Age Differences 


On all but one of the eight major comparisons, 
age differences or age-sex interactions were ob- 
served. Adults clearly grouped human figures 
more often than did adolescents, and the ado- 
lescents, in turn, grouped human figures more 
than did preadolescents. These age differences 
were clearly significant in arranging the three
men-three rectangles (χ² = 14.93, df = 2, p <
.001) and the three women-three rectangles (χ²
= 13.78, df = 2, p < .01). Younger children were
less likely to use height-ordered arrangement of
three rectangles as compared with older subjects
(χ² = 4.32, df = 1, p < .05), and showed a small,
but unreliable tendency to center the dog on the 
MWD set of figures. 
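
The probability levels attached to these chi-square values can be checked with the closed-form upper-tail probabilities available for 1 and 2 degrees of freedom. The sketch below uses only the Python standard library; the function name is hypothetical, and it covers only the two cases that occur in this report.

```python
import math


def chi_square_upper_tail(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable.

    df = 1: X is a squared standard normal, so the tail is erfc(sqrt(x / 2)).
    df = 2: X is exponential with mean 2, so the tail is exp(-x / 2).
    """
    if x < 0:
        raise ValueError("chi-square values are non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 and df = 2 are handled in this sketch")
```

Applied to the age comparisons, the values 14.93 and 13.78 with df = 2 and 4.32 with df = 1 fall below the .001, .01, and .05 levels respectively, in agreement with the levels reported.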

Placements of parent-child figures revealed 
some deviation expected in adolescent responses.
In arranging the man, woman, child set, the ado-
lescents tended to use the modal MWC order
less frequently than preadolescent or adult
groups (p < .10). Among males only, this differ-
ence attained statistical significance (χ² = 6.34,
df = 2, p < .05), suggesting some support for
the notion that some confusion or turmoil about 
parent-child relationships is part of the adoles- 
cent's experience. A further age-sex interaction 
was observed in placements of man-child and 
woman-child figures. Again, among males only, 
the adolescents, community and delinquent alike, 
showed a much stronger tendency to place the 
child nearer the mother. While this is the mode 
for six of the eight groups, the schema is sig- 
nificantly stronger among adolescent males as 
compared with preadolescents or adults (χ² =
5.58, df = 1, p < .02), suggesting, perhaps, that
some Oedipal notions may be components of 
social perception in the adolescent male. 


Sex Differences 


In addition to the age-sex interactions already 
noted, two further sex differences were observed. 
On the man, woman, dog figure there emerged a
reliable tendency for each sex group to center
the like-sexed figure (χ² = 6.88, df = 1, p <
.02). A further sex difference was found in the
placements of the geometric forms square, circle,
triangle. Males were significantly more likely to
produce a vertical arrangement of these figures
(χ² = 7.41, df = 1, p < .02). Moreover, this sex
difference interacts with age so that there is an
increasing tendency for males to give a vertical
response with increasing age (χ² = 10.68, df =
2, p < .01). The interpretation of this sex dif-
ference is clearly beyond the scope of the present
data.


Socialization Differences 


Surprisingly little of the variation in social 
schemas was attributable to the gross differences 
in social experience between community and de-
linquent adolescents. Only one of the eight com-
parisons produced a significant difference. In
placing the three rectangles, the Juvenile Hall
subjects were much more likely to give the modal
height-ordered arrangement (χ² = 13.88, df =
1, p < .001). This finding contradicts our hy-
pothesis by pointing to greater conformity rather
than deviation on the part of the delinquent 
group. The rather stereotyped response of the 
Juvenile Hall subjects was in marked contrast 
to the more individual and creative arrange- 
ments of geometric forms which the community 
adolescents gave fairly frequently. 

The expectation that the delinquent adoles- 
cent should experience a more tenuous sense of 
his sex-role identity was not supported. Two
trends were consistent with the hypothesis, but 
fell short of statistically reliable differences. In 
placing man, woman, dog, the delinquent sub- 
jects tended to reverse the more general tend- 
ency to center the like-sexed figure. In addition, 
the delinquent girls were less likely to place 
mother and child close together than were the 
community adolescent girls. 

Thus the overall findings suggest that the 
differences in social experience implied in the 
status of delinquency make remarkably little 
difference in the present social perception task. 
These findings are consistent with Kuethe and 
Weingartner's (1964) observations that peniten- 
tiary inmates, despite their difference in socio- 
economic status and social experience, gave re- 
sponses virtually equivalent to those of male 
college students, with only the clearly homo- 
sexual subjects departing from these normative 
expectations. One possible interpretation is the 
familiar observation that delinquent and crimi- 
nal persons do not differ from the general 
population in their knowledge of social expecta- 
tions, although they do not always act upon this 
knowledge. 
The general implications of the study seem to
be these: On one hand, the findings offer con- 
siderable support for Kuethe's interpretation of 
social schemas as general organizing tendencies 
in social perception, as well as evidence for the 
learning of social schemas reflected in the age 
trends noted here. On the other hand, the data 
suggest that influences of sex and of develop- 
mental level limit the generality of specific social 
schemas. In a more speculative way, the findings 
may also suggest that children structure their 
experiences in more spontaneous, individual ways 
than older, more thoroughly socialized people; 
that the impact of adolescence—especially 
among males—may be more preemptive in struc- 
turing experience than the more often empha- 
sized effects of distorted social experience, and 
that the effects of age and sex—clearly biological 
as well as social in their implications—may make 
more difference than the socialization variable 
in this kind of social perception. Finally, it
should be noted that the present data, like the
findings of other investigators of a wide range
of psychological problems, point to more and to
clearer relationships among males than among
females.


STRUCTURE OF BOREDOM 1


P. JAMES GEIWITZ 2


University of Michigan 


The human experience of boredom is studied in relation to arousal, constraint, 
subjective repetitiveness, and unpleasantness. Intense boredom induced by a 
simple repetitive task is found to be associated with decreased arousal and in- 
creased constraint, repetitiveness, and unpleasantness. In an attempt to syn- 
thesize boredom, induction of each independent variable by means of post- 
hypnotic cues indicates significant effects for arousal and constraint but not 
for repetitiveness and unpleasantness. No single variable is found necessary 
for boredom although the evidence suggests that normally all 4 factors are 
present. Implications of findings for current boredom theories are discussed. 


Though boredom is certainly a problem of 
increasing practical and theoretical importance, 
psychologists have made little progress toward a 
molecular theory. There exist some molar con- 
cepts that are of use in industrial settings, but 
the basic questions remain. What is boredom?
What causes boredom? What are its effects? 
Whenever boredom is discussed, certain con- 
structs are mentioned. Berlyne (1960) and Hebb 
(1955, 1958) have stressed the role of arousal. 
Unfortunately, they disagree as to the level of 
arousal to be associated with boredom: Hebb 
suggests a lowered level while Berlyne favors 
a high level interpretation. Hebb has empirical 
backing in studies done by Barmack (1937, 1938, 
1939b, 1939c, 1940; Seitz & Barmack, 1940), by 
McBain (1961), and by Heron (1957). Berlyne, 
not unaware of this evidence, explicitly rejects the 
"temptation" and formulates a high arousal 
theory, primarily on the basis of research by 
Sokolov and his associates (Roger, Voronin, &
Sokolov, 1958; Vinogradova & Sokolov, 1955) 
and of his own theory of RAS functioning 
(Berlyne, 1960). The question is very much an 
open one. 

A second construct often presumed to be 
related to boredom is monotony. Industrial re- 
search typically centers on this variable. With 
monotony held constant, however, differences in 
boredom are reported between groups varying 
on a number of dimensions. To cite the stereo- 
type, the person less likely to be bored by a 
given task is stupid (Burnett, 1925; Korn- 
hauser, 1922; Wyatt, 1927; Wyatt, Fraser, & 
Stock, 1929; Wyatt, Langdon, & Stock, 1937), 
old (Heron, 1952; Smith, 1955), uncreative 
(Wyatt et al, 1937), with a dull “real” life 
(Smith, 1955) and a meager fantasy life (Bar- 
mack, 1937; Smith, 1955). Such individual dif- 
ferences suggest that monotony objectively de- 
fined as an attribute of the situation is less 
important than the subjective feeling of repeti- 
tiveness. This feeling is influenced by the situa- 
tion one is in, of course, but it also reflects indi- 
vidual characteristics and personal motivations. 

A third construct of import might be termed 
constraint. Barmack (1939a), when asked to 
distinguish between boredom and satiation, re- 
plied that satiation is a point at which a subject 
will voluntarily reject the task whereas boredom
occurs if the subject is compelled to remain at
the task after the satiation point. Fenichel
(1951) phrased it this way: Boredom “arises 
when we must not do what we want to do, or 
must do what we do not want to do [p. 359].” 
Empirical evidence is sparse on this construct, 
but perhaps some of Karsten’s (1928) work on 
satiation is relevant. When the experimenter of- 
fered mild suggestions to continue after the 
satiation point had been reached, she found that 
her subjects had “negative valence” toward the 
task. Performance deteriorated sharply and com-
plaints of fatigue increased. Such results indicate
increasing boredom. 

Some researchers in the field of sensory de- 
privation (Freedman, Grunebaum, & Greenblatt, 
1961) have hinted that degree of constraint 
may be a factor in the boredom produced in 
deprivation settings. 

A final candidate for a major role in boredom
is general negative affect or unpleasantness. 
Everyone assumes that boredom is unpleasant 
and Block’s (1957) study lends some introspec- 
tive support. Again, however, there is contro- 
versy here about the reason why boredom is 
unpleasant (if, indeed, it is). Berlyne (1960) 
and Fenichel (1951) take a traditional stand,
suggesting that the unpleasantness in boredom
is caused by the presence of a high drive state.
Hebb (1949) argues that low drive or arousal
produces the unpleasantness through disorganiza-
tion of neural firing.


Thus we have four constructs that might allow 
an embryonic theory if their relationships to 
boredom were known. The purpose of this 
study is to obtain empirical evidence on these 
relationships. 


METHOD 


Four male subjects served in this experiment. All 
were selected from student volunteers for paid re- 
search at the University of Michigan. Since the 
procedure utilized hypnotic induction, subjects were 
selected primarily on the basis of their scores on 
Form A of the Stanford Hypnotic Susceptibility 
Scale (SHSS; Weitzenhoffer & Hilgard, 1959); only 
those scoring 11 or 12 (maximum score = 12) were 
retained. The lack of personality correlates of hyp- 
notic susceptibility (Hilgard, in press) would indi- 
cate that such a selection procedure does not lead 
to great sample bias. 


Overview of the Experiment 


Training. The subject began the experiment with 
a training session designed to enable him to reliably 
identify various degrees of boredom and interest. 
Three degrees of each were induced by means of 
posthypnotic cues. With amnesia for the presented 
cue, the subject labeled his experience by saying one 
of the cues, the series of which, in effect, became 
a boredom-interest self-rating scale. Training con- 
tinued until the subject’s accuracy reached 75% 
or better. 

Natural series. Here various levels of boredom (as 
assessed by the subject’s rating) were induced by 
varying durations of a simple repetitive task: making 
checks on a piece of paper. Performance decrement 
(quality and placement of checks) was assessed to 
partially validate the verbal rating of boredom level. 
Time estimates were used for the same end. The 
dependent variables were levels of arousal, con- 
straint, unpleasantness, and repetitiveness as assessed 
by self-rating scales. This phase thus constituted the 
analysis of boredom in terms of the constructs 
presumed to be operating. 

Partly synthetic series. The synthesis of boredom 
was attempted by inducing various levels of arousal, 
constraint, unpleasantness, and repetitiveness by 
means of posthypnotic cues. Boredom becomes the 
dependent variable. Each of the four independent 
variables was individually induced in various levels 
and the degree of boredom was indicated by the 
subject's verbal report. The amount of behavioral 
activity was observed and rated to partially validate 
the boredom rating. For the same purpose, the sub-
ject made time estimates while in the different states.

Wholly synthetic series. This series was identical
to the preceding except for the instruction (given 
to the subject under hypnosis) to keep each of 
the “other” independent variables constant at a
neutral level while the one was being manipulated. 
For example, when constraint was induced at a 
high level, the subject was to keep unpleasantness, 
arousal and repetitiveness constant at a normal 
degree. Thus the effect of each factor in isolation 
could be assessed. 

Factorial series. With arousal held neutral, constraint,
unpleasantness, and repetitiveness were
manipulated conjointly in a 2 × 2 × 2 factorial
design.


Operational Criteria 


Boredom. Boredom and interest were indicated 
primarily by the following cues: B3, very, very 
bored; B2, fairly bored; B1, slightly bored; I3, very, 
very interested; I2, fairly interested; I1, slightly
interested; 0, not bored, not interested. 

The subject was instructed under hypnosis that 
whenever he heard or saw one of these cues (in 
the laboratory only) he would respond with the 
appropriate degree of boredom or interest. In addi- 
tion, whenever asked to describe his experience of 
boredom or interest, he was to reply by giving the 
cue closest to his actual feeling. The subject was 
free to interpret boredom as it had meaning for 
him in real life, but inquiries were made under 
hypnosis to insure that this interpretation was 
common and not idiosyncratic. 

In various stages of the study, concurrent measures
theoretically related to boredom were taken to sub-
stantiate the verbal rating. In the natural series,
performance decrement on the simple repetitive task 
was rated. In both synthetic series, the subject was 
observed (without his knowledge) through a one- 
way mirror as he sat alone in the experimental room. 
Two raters without knowledge of the posthypnotic 
cues to which he was responding independently 
rated his level of behavioral activity on a 7-point 
scale running from 1, “very withdrawn, weary,” to
7, “fairly alert and responsive.” 3 It was assumed
that more boredom would be reflected in lower 
ratings. Finally, the subject made production esti- 
mates (Bindra & Waksberg, 1956) of 10-second 
intervals while in the various states; previous 
research (Geiwitz, 1964a; Loehlin, 1959) suggested
that more boredom would be reflected in greater 
overestimation. 

Arousal. Arousal as an independent variable was 
induced with posthypnotic cues as follows: +AA, 
mental arousal at a fever pitch, corresponding to a 
state of great excitement (but not nervous or 
upset); +A, mental arousal halfway between 0 and 
+AA; 0, mental arousal at normal waking level;
—A, mental arousal halfway between 0 and —AA; 
—AA, mental arousal corresponding to what it is 
at the deepest stage of hypnosis or in sleep (but 
not actually asleep). 


3 Geiwitz (1964b) includes the details of pro- 
cedure. The reader interested in the behavioral 
rating scales, the self-rating scales, verbatim instruc- 
tions, and other details is asked to refer to that 
manuscript. 
Arousal was carefully described to exclude any
sensorimotor emphasis and to stress the purely 
cognitive aspects. As defined for the subject, it was 
the general level of mind activity, a volume control, 
so to speak, which could be turned up or down by 
the cues.

Arousal as a dependent variable was assessed by 
two self-rating scales designed to indicate degree by
either of the two aspects the subject wished to 
emphasize. The first ran from 1, “very, very tired,” 
to 9, “very, very alert”; the second ran from 1, 
“mind extremely active,” to 8, “mind mostly a 
blank.” That scale which correlated higher with the 
posthypnotic cues for arousal was used in the 
statistical analyses. 

Constraint. The subject was told that what we 
meant by the subjective feeling of constraint was “a 
feeling that if you were perfectly free to do anything 
you wanted to do, you would not be doing what 
you are doing, you would choose to do something 
else.” As a dependent variable, constraint was as- 
sessed by a self-rating scale which ran from 1, 
“very, very much like to do something else,” to 
7, “like doing this.” The subject was instructed to 
use this scale on the basis of its less emotional 
aspect, a sort of unemotional recognition of being 
compelled to do something, as distinguished from 
the emotional aspects implied by the perhaps 
unfortunate inclusion of the word “like.” 

The posthypnotic cues indicating levels of con- 
straint as an independent variable were phrases taken 
from the self-rating scale: “content doing this”
(low constraint), “like to do something else”
(medium), “very much like to do something else”
(high). 

Unpleasantness. The degree of unpleasantness ex- 
perienced by the subject on a trial was assessed by 
a self-rating scale running from 1, “definitely
pleasant,” to 9, “definitely unpleasant.” As an 
independent variable, the posthypnotic cues (taken 
from the scale) were: “possibly on the pleasant side” 
(low), “mildly unpleasant” (medium), “definitely 
unpleasant” (high).

Repetitiveness. The self-rating scale (dependent 
variable) running from 1, “endlessly repetitive,” to 
7, “not at all repetitive,” was accompanied by clari- 
fying instructions. The subject was told to use this 
scale in reference to his subjective feeling of how 
repetitive the situation was, as distinguished from 
any sort of objective assessment of the factor. The 
posthypnotic cues used were: “not noticeably re-
petitive” (low), “fairly repetitive” (medium), “very,
very repetitive” (high).


Design and Procedure 


Natural series. The subject sat alone in a small 
room facing a table on which several sheets of 
Champion 636 data paper were placed. At the signal 
“start” communicated through earphones from the 
experimenter in an outer observation room, the 
subject began making checks at a previously learned 
rate of about 40 per minute, counting aloud from 1 
to 10 (over and over) as he did so. Six different 
task durations were used: 12, 6, 4, 3, and 1 minutes,
and 30 seconds. At the signal “X” the subject
stopped making checks and visualized a mental 
image of an X.* By prior instruction under hyp- 
nosis, whatever mental state pertained at the X 
signal was to be maintained until the completion of 
the second time estimate (see below). After visuali- 
zation of the X, the experimenter gave another 
signal—“begin.” When the subject thought that 10 
seconds had elapsed from that signal, he said 
“stop.” By instruction, he then gave his rating of 
his boredom or interest by saying one of the cues. 
A second time estimate followed. Finally, the sub- 
ject, now in a normal waking state, reported his 
level of arousal, constraint, unpleasantness, and re-
petitiveness from the self-rating scales. Each trial
was separated from the next by a short inquiry 
about the ease or difficulty of using the self-rating 
scales. 

The actual time elapsed in the time estimates was 
recorded from a stopwatch. The subject had been 
instructed not to count to himself while estimating. 

Two trials per subject per duration constituted
the natural series. Order of durations was randomized
independently for each subject.

Partly synthetic series. The subject sat alone in 
the inner experimental room facing a small table 
containing a number of index cards. At the signal 
“turn” from the experimenter in the outer room, the 
subject turned over the top card and immediately 
began to respond to the posthypnotic cue written
there. He then placed the card face down in a box
next to him, thereupon forgetting what was written
on the card although continuing to experience the
proper feeling. The cue was one of the three levels
of constraint, unpleasantness, or repetitiveness or
one of the five levels of arousal. The effect of the 
cue was to last until the second time estimate, as in 
the natural series. 

The assessment battery was identical to that used 
in the natural series. Thirty seconds after the sub- 
ject turned over the top card, the experimenter said 
“X,” which began the assessment. The 30 seconds
preceding the assessment were used to observe the 
subject’s behavioral activity.

Two trials per subject per cue—a total of 28 trials 
per subject—constituted the partly synthetic series. 
These 28 trials were divided into three sessions: a 
randomized sequence of 10, composed of two trials 
for each of the arousal cues; a randomized sequence 
of 9, one trial for each of the other conditions; 
another session like the second, that is, the second 
trial for each of the other cues.
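
The trial counts in this paragraph can be checked with a short sketch. The cue labels follow the text, but the function name, the seed, and the use of a pseudorandom shuffle are assumptions made for illustration; the source does not say how the randomized sequences were generated.

```python
import random

# Cue sets as described in the text: five arousal levels, and three levels
# each for constraint, unpleasantness, and repetitiveness.
AROUSAL_CUES = ["+AA", "+A", "0", "-A", "-AA"]
OTHER_CUES = [
    (factor, level)
    for factor in ("constraint", "unpleasantness", "repetitiveness")
    for level in ("low", "medium", "high")
]


def build_sessions(seed=0):
    """Return three randomized sessions (hypothetical helper)."""
    rng = random.Random(seed)
    # Session 1: two trials for each of the five arousal cues.
    session_1 = [("arousal", cue) for cue in AROUSAL_CUES] * 2
    # Sessions 2 and 3: one trial each for the nine remaining cues.
    session_2 = list(OTHER_CUES)
    session_3 = list(OTHER_CUES)
    for session in (session_1, session_2, session_3):
        rng.shuffle(session)
    return [session_1, session_2, session_3]


sessions = build_sessions()
print([len(s) for s in sessions])     # [10, 9, 9]
print(sum(len(s) for s in sessions))  # 28
```

The 14 distinct cues (5 arousal plus 3 × 3 others), each presented twice, give the 28 trials per subject stated above.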

Wholly synthetic series. The procedure for this 
series was identical to that used in the previous 
series. The subject, however, had been instructed 
to hold the three “other” independent variables 


4 Visualization of mental images was included 
under the assumption that it would reflect level of 
arousal. Work directly testing this assumption was 
done concurrently by G. S. Blum. His results suggest 
a relationship but by no means a simple one. Further 
discussion of mental images is therefore omitted 
from this study.
constant at a neutral level while responding to the
one written on the card. The neutral degree was 
carefully specified for each condition: arousal, the 
zero condition; constraint, “50:50, don’t mind doing 
this, don’t mind doing something else”; unpleasant- 
ness, “neither pleasant nor unpleasant”; repetitive- 
ness, “not noticeably repetitive.” 

To give a sample trial, the subject might turn 
over a card with the phrase cue, “Like to do some- 
thing else.” He was to respond with that level of 
constraint (medium) while simultaneously holding 
arousal at the normal waking level, unpleasantness 
at “neither pleasant nor unpleasant,” and repetitive- 
ness at the degree signified by “not noticeably re- 
petitive.” 

Only the three lower levels of arousal were used 
(0, —A, —AA), enabling us to complete this series 
in two blocks of 12 randomized trials.

Factorial series. The factorial series was the only 
one in which the independent variables were varied 
conjointly. Two levels (high and low) of constraint,
unpleasantness, and repetitiveness were induced
in a 2 × 2 × 2 factorial design. Arousal was
held constant at the zero level on all trials. 

The subject sat alone in the inner experimental 
room, as before. The experimenter read the phrase 
cues depicting the levels of constraint, unpleasantness, 
and repetitiveness to be assumed on each trial. After 
15 seconds, the experimenter said “X” and then 
asked for the subject’s rating of boredom or interest. 

With eight possible treatment combinations and a 
desired two trials per combination, we could have 
completed this series in 16 trials. Instead 19 trials 
were run, with 5 replications of the base-line com- 
bination (low of all 3) instead of 2 in order to 
assess any order effects. The 19 trials were com-
pleted in one session; order was counterbalanced.
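
As a check on the arithmetic, the treatment combinations and the 19-trial total can be enumerated in a few lines. This is only a sketch: the variable names are hypothetical, and the actual counterbalanced order of trials is not given in the source.

```python
from itertools import product

FACTORS = ("constraint", "unpleasantness", "repetitiveness")
LEVELS = ("low", "high")

# All 2 x 2 x 2 treatment combinations.
combinations = list(product(LEVELS, repeat=len(FACTORS)))
baseline = ("low", "low", "low")


def replications(combo):
    """Five replications of the base-line combination, two of every other."""
    return 5 if combo == baseline else 2


trials = [combo for combo in combinations for _ in range(replications(combo))]
print(len(combinations))  # 8
print(len(trials))        # 19
```

Seven combinations at two trials each plus five base-line trials give 14 + 5 = 19.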


RESULTS 
Validation of Boredom Reports 


In the natural series, subjects made checks on 
paper for varying durations, then reported their 
degree of boredom. If their reports truly re- 
flected boredom, one would expect that the qual- 
ity of their performance on the task would be 
related to their report. The checks were rated 
independently by two judges (median interrater 
reliability = .91) and the results are shown in 
Part A of Table 1. The 12 trials for each sub- 
ject were divided as closely as possible to a 
median split on the basis of degree of boredom 
reported. 

For all four subjects, the greater decrement is 
associated with more boredom.

Behavioral observation of the subject’s level 
of activity in the 30 seconds from cue presenta- 
tion to assessment in the two synthetic series 
provided another validation measure. Two inde- 
pendent observers were used (median reliabil- 
ity = .85). As mentioned previously, the ob- 
servers did not know to which cue the subject 
TABLE 1
VALIDATION MEASURES FOR REPORTED BOREDOM

A. Performance decrementᵃ

Level of               Subjects
boredomᵇ        A      B      C      D
Low           2.18   2.25   2.13   3.23
High          4.46   3.06   2.44   3.83

B. Behavioral observationᶜ

Level of               Subjects
boredomᵇ        A      B      C      D     Subseries
Low           6.37   3.99   3.87   4.73    Partly synthetic, I
High          2.56   2.88   3.05   3.89
Low           5.92   5.15   3.76   5.28    Partly synthetic, II
High          3.14   4.74   3.47   4.89
Low           5.76   4.13   3.34   4.43    Partly synthetic, III
High          3.10   3.76   3.20   4.27
Low           4.75   3.91   3.94   4.43    Wholly synthetic, I
High          3.92   3.69   3.30   4.14
Low           5.19   3.24   4.81   3.78    Wholly synthetic, II
High          3.74   3.00   3.97   3.27

ᵃ Rating scale for performance decrement: from 1, “Better
than average” (average equals performance at start of Trial 1),
to 6, “Extreme decrement.”

ᵇ Approximate median split.

ᶜ Rating scale for behavioral observation: 1, “Very with-
drawn, weary,” to 7, “Fairly alert and responsive.”


was responding, nor did the subject know he was 
being observed (according to the inquiry fol- 
lowing the experiment). With again a median 
split on the basis of reported boredom, one 
would expect less activity (more withdrawal) to 
be associated with greater boredom. The results 
are shown in Part B of Table 1. Since the data 
from the partly synthetic series were gathered in 
three sessions and those from the wholly syn- 
thetic in two, we have five observation sessions 
per subject. As Table 1 shows, all five compari- 
sons are in the expected direction for every sub- 
ject, rather conclusive evidence that the sub- 
ject’s report was not isolated from other observ- 
able signs of boredom. 

The estimates of 10-second intervals were ex- 
pected to show greater overestimation with more 
boredom. This assumption failed; in general, the 
estimates were approximately equal in mean 
value for high and low boredom. Without mini- 
mizing these results, later inquiry suggested that 
the subjects did not make their estimates in a 
way conducive to mean differences. We had ex-
pected time to pass more slowly as boredom
increased. While all subjects reported this to be 
so, they also suggested that more boredom made 
them “lose track of what they were doing"—a 
reasonable expectation, had we considered it. 
Inattention during the estimation task would 
tend to produce longer estimates (underestima- 
tion by this method) while the "slow time" 
would have just the opposite effect. Thus high 
boredom seems to have produced two effects 
which, over many trials, cancel each other— 
hence, no mean differences. 

Three of the four subjects showed signifi- 
cantly greater variance among estimates made on 
“more boredom” trials, as would be expected 
from our revised, post hoc assumption. 

The overall validation picture, however, is
good. The performance decrement and behavioral
observation results are in ample agreement with
the boredom reports. The failure of time estimates
to relate to reports has a reasonable explanation
other than lack of validity.

Inquiries following the experiment gave evi- 
dence that the experiences of boredom were real 
to the subjects. All said it was about the same 
in quality as that they experienced in “real life”; 
two said it was slightly greater in intensity, one 
said it was about the same, and the fourth re- 
ported it to be slightly weaker. 


Results of Primary Analyses 


Natural series. Analysis of the boredom state 
takes the form of correlations of reported bore- 
dom induced by varying durations of a simple 
repetitive task with arousal, constraint, unpleas- 
antness, and repetitiveness as assessed by self- 
rating scales. Table 2 gives the results for each 
subject in each of the two sessions of the natural 
series. Signs of the correlations have been re- 
versed in some cases so that a positive sign indi- 
cates the correlation of more boredom with low 
arousal and high intensities of the other three 
factors. 

All variables exhibit generally high correlations 
in the positive direction. As boredom increases, 
arousal decreases while constraint, unpleasant- 
ness, and repetitiveness increase. 

Partly synthetic series. This phase was designed
to test the effect of the four factors as inducers 
of boredom. Taking the average of the two bore- 
dom reports in each condition, the Friedman 
analysis of variance for rank order of condi- 
tions was computed. The means and analysis 
are given in Table 3. 
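
For readers unfamiliar with the Friedman analysis, the rank-order statistic can be sketched as below. The scores are hypothetical and are not the study's data; the function names are illustrative, and the formula shown is the standard one without a correction for ties.

```python
def rank_with_ties(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def friedman_chi_square(scores):
    """Friedman statistic for scores[subject][condition], no tie correction."""
    n = len(scores)     # number of subjects
    k = len(scores[0])  # number of conditions
    rank_sums = [0.0] * k
    for row in scores:
        for c, rank in enumerate(rank_with_ties(row)):
            rank_sums[c] += rank
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3 * n * (k + 1)


# Hypothetical mean boredom ratings for 4 subjects under low, medium, and
# high levels of one factor (illustration only, not the study's data).
hypothetical = [
    [5.0, 3.5, 2.0],
    [4.5, 3.0, 1.5],
    [4.0, 2.5, 2.0],
    [5.5, 4.0, 3.0],
]
print(friedman_chi_square(hypothetical))  # 8.0
```

With four subjects and three levels, complete agreement in rank order yields 8.0, the largest value the uncorrected statistic can take for that layout.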

All factors reach significant levels of effect 
except arousal, which approaches significance. 
Inspection of the means shows that induction of 
TABLE 2
CORRELATIONS OF BOREDOM WITH AROUSAL, UNPLEASANTNESS, CONSTRAINT,
REPETITIVENESS, AND TASK DURATION IN TWO SESSIONS
OF NATURAL SERIES (N = 6)

Subjects
Variable A B C D
I II I II I II I II
Arousal .99 .99 .49 .91 .95 .88 .78
Unpleasantness 125 .93 .88 : —ᵃ .87 .98 .89
Constraint .78 .99 .88 .93 .87 .90 .91 .92
Repetitiveness .94 .97 .70 .95 .83 .91 .81 .92


ᵃ Correlation cannot be computed because one “variable” does not vary.


more boredom is caused by lower arousal, higher 
constraint, higher unpleasantness, and higher 
repetitiveness. 

Wholly synthetic series. The partly synthetic
series, by design, did not provide a pure test of
the effect of each factor on boredom because
when one of the four was induced, the other 
three were free to vary. They usually did. For 
example, when high unpleasantness was cued, 
the assessment scales showed an accompanying 
increase in constraint and repetitiveness and a 
decrease in arousal. It was almost as if each 
cue sparked a redintegration of the entire com- 
plex that was associated with boredom in the 
natural series. 


TABLE 3
BOREDOM AS A FUNCTION OF AROUSAL, UNPLEASANTNESS,
CONSTRAINT, AND REPETITIVENESS IN
PARTLY SYNTHETIC SERIES

Condition and degree        M        χr²

Arousal                              8.75*
Unpleasantness
  Low
  Medium
  High
Constraint
Repetitiveness
  Low
  Medium
  High

[The cell means and the row placement of two further χr² entries, 8.00** and 8.00**, are illegible in the source.]

Note. Scale equivalents of boredom: B3 = 1, very, very
bored; B2 = 2; B1 = 3; 0 = 4; I1 = 5; I2 = 6; I3 = 7,
very, very interested.

* p < .10.

** p = .01.


The wholly synthetic series, however, showed 
little of this redintegration effect since subjects 
were under instruction to hold the other three 
constant at a neutral level when one was in-
duced. It therefore functions as a test of the iso-
lated effect of each factor. Table 4 gives the
means and Friedman analysis. 

The general picture is one in which low 
arousal and high constraint are significant fac-
tors in boredom. The effect of unpleasantness
reaches trend level but that for repetitiveness is
not nearly significant.

Factorial series. The final part of the experi- 
ment was a test of the effects of constraint, un- 
pleasantness, and repetitiveness varied factorially 
with arousal held constant at a neutral level.
Over the sequence of 19 trials in the series, the 
base-line combination (low degree of each 


TABLE 4
BOREDOM AS A FUNCTION OF AROUSAL, UNPLEASANTNESS,
CONSTRAINT, AND REPETITIVENESS
IN THE WHOLLY SYNTHETIC SERIES

Condition and degree        M        χr²

Arousal
  0                       4.19
  −A                      3.06      6.5*
  −AA                     2.38
Unpleasantness
  Low                     4.50
  Medium                  2.94      4.6
  High                    2.44
Constraint
  Low                     4.06
  Medium                  3.12      7.1*
  High                    2.19
Repetitiveness
  Low                     3.75
  Medium                  3.12      1.1
  High                    2.31


Note. Scale equivalents of boredom as in Table 3.
* p < .05.
TABLE 5
MEANS AND ANALYSIS FOR FACTORIAL SERIES

Source                             df           MS            F
Between subjects               [illegible]
Within subjects
  Levels of unpleasantness (A) [illegible]  [illegible]     32.08*
  A × Subjects                 [illegible]  [illegible]
  Levels of constraint (B)          1          13.85       419.70**
  B × Subjects                      3            .03
  Levels of repetitiveness (C)      1           3.61      [illegible]
  C × Subjects                      3       [illegible]
  A × B                             1       [illegible]      1.67
  A × B × Subjects             [illegible]  [illegible]
  A × C                             1       [illegible]     <1.00
  A × C × Subjects                  3       [illegible]
  B × C                        [illegible]  [illegible]   [illegible]
  B × C × Subjects                  3       [illegible]
  A × B × C                         1            .00        <1.00
  A × B × C × Subjects              3            .07


Meansᵃ

Unpleasantness    Constraint    Repetitiveness      M
Low               Low           Low               4.38
Low               Low           High              3.45
Low               High          Low               2.75
Low               High          High              2.12
High              Low           Low               2.98
High              Low           High              2.06
High              High          Low               1.72
High              High          High              1.25


ᵃ Scale equivalents of boredom as in Table 3.
* p < .05.
** p < .01.


factor) was repeated five times to assess “bore- 
dom drift.” For only one subject did any drift 
occur and his data were therefore adjusted to an 
arbitrarily set base of “0.” Adjustments assumed 
a linear drift between any two base-line assess-
ments and that drift and cue effects were simply
additive. Table 5 gives a four-factor analysis of
variance.
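
The two stated assumptions (linear drift between base-line assessments, additive drift and cue effects) suggest an adjustment along the following lines. This is a minimal sketch only: the source gives no computational details, so the function names, the trial positions, the ratings, and the choice of the first base-line rating as the reference point are all hypothetical.

```python
def drift_at(position, baselines):
    """Estimate drift at a trial position by linear interpolation.

    baselines: list of (trial_position, baseline_rating) pairs sorted by
    position. Drift is measured relative to the first base-line rating
    (an assumption made for this sketch).
    """
    reference = baselines[0][1]
    for (p0, r0), (p1, r1) in zip(baselines, baselines[1:]):
        if p0 <= position <= p1:
            fraction = (position - p0) / (p1 - p0)
            return r0 + fraction * (r1 - r0) - reference
    raise ValueError("position lies outside the base-line assessments")


def adjust(rating, position, baselines):
    """Remove the estimated drift, assuming drift and cue effects add."""
    return rating - drift_at(position, baselines)


# Hypothetical numbers for illustration only (not the study's data).
baselines = [(1, 4.0), (5, 3.0), (10, 3.0)]
print(adjust(2.5, 3, baselines))  # 3.0
```

In this example the base line falls by one scale point between Trials 1 and 5, so a rating of 2.5 on Trial 3 is credited with half a point of drift and adjusted to 3.0.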

It can be seen that these results are congruent 
with those of other phases in regard to con- 
straint. Here, however, we find significant main 
effects for unpleasantness and repetitiveness, 
although the F for constraint is much greater. 
No significant interactions are found. 
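
The direction and rough size of each main effect can be read from the cell means by averaging over the other two factors. The sketch below assumes that the eight means listed in Table 5 map onto the design as unpleasantness, then constraint, then repetitiveness, which is a reading of the table layout rather than something stated in the text; the names are illustrative. Lower scale values indicate more boredom.

```python
# Cell means as read from Table 5; keys are
# (unpleasantness, constraint, repetitiveness). The mapping of means to
# cells is an assumption based on the order in which the table lists them.
CELL_MEANS = {
    ("low", "low", "low"): 4.38,
    ("low", "low", "high"): 3.45,
    ("low", "high", "low"): 2.75,
    ("low", "high", "high"): 2.12,
    ("high", "low", "low"): 2.98,
    ("high", "low", "high"): 2.06,
    ("high", "high", "low"): 1.72,
    ("high", "high", "high"): 1.25,
}
FACTORS = ("unpleasantness", "constraint", "repetitiveness")


def marginal_mean(factor, level):
    """Average the cell means over the two factors not named."""
    idx = FACTORS.index(factor)
    values = [m for key, m in CELL_MEANS.items() if key[idx] == level]
    return sum(values) / len(values)


for factor in FACTORS:
    low = marginal_mean(factor, "low")
    high = marginal_mean(factor, "high")
    # A positive drop means the high level produced more reported boredom.
    print(f"{factor}: low={low:.3f} high={high:.3f} drop={low - high:.3f}")
```

Under this reading, raising any of the three factors lowers the mean rating (more boredom), with the largest drop for constraint and the smallest for repetitiveness.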

Summary. The general conclusions supported
by the four experiments are these: Reported 
boredom is associated with low arousal, increased 
feelings of unpleasantness, constraint, and re- 
petitiveness. Boredom can be produced or syn- 
thesized by lowering arousal or by increasing one 
of the other three factors. Each variable tends 
to redintegrate a complex of all four which, in 
turn, results in a report of intense boredom. 


Each alone, however, with the others held con- 
stant, can produce boredom, a conclusion un- 
equivocal for lowered arousal and constraint but 
less certain for unpleasantness and repetitiveness. 


Alternative Explanations 


Following the experiment proper, each subject 
was asked to indicate, for a list of 19 “feelings”
or “experiences,” whether he thought an increase
in that feeling would produce an increase, a de- 
crease, or no change in his experience of bore- 
dom. Our interest centers on 8 of these 19 feel- 
ings, 4 of which were experimental variables and 
4 of which were feelings judged “bad” and “low”
on semantic differential scales in a study by
Block (1957): grief, guilt, humiliation, and 
worry. Thus an attempt was made to assess the 
influence of “general negativeness” in our experi-
ment.

The results show that while subjects saw the 
experimental variables as important in boredom, 
they suggested that the other four variables are 
not. In 12 of the latter 16 cases (4 feelings, 4
subjects), subjects reported that an increase in
the negative feeling would result in either no 
change or perhaps even a decrease in boredom. 
These results indicate that the general factor of 
negativeness is not predominant. 

A second task was designed to test the effect 
of the subjects’ expectations. They were asked 
to rank order the four experimental variables in 
terms of their expected effect on boredom. Three
of the four rated repetitiveness first and all 
rated low arousal last. Since the empirical results
give essentially the opposite ordering, the
subjects’ expectations could not have been an
overwhelming factor.


DISCUSSION 


Several experimental findings deserve further 
research attention. For example, the association 
of boredom with low arousal, as we mentioned in
the introduction, is by no means generally accepted.
Subjective repetitiveness as the most
equivocal factor is surely not in line with com- 
mon interpretations of boredom. The major role 
of constraint, a factor typically ignored in 
scientific discourse, suggests that its absence is a 
serious oversight. 

In regard to arousal, we might suggest that
theoretical disagreement is at least partly a se- 
mantic illusion. In this study, we defined and used 
the construct with major emphasis on its cog-
nitive aspects. Let us then say that low cognitive
arousal has been shown to be influential in bore- 
dom. Berlyne (1960), the major theorist holding 
a high arousal position, may well agree with 
these results; the cause of high arousal in his 
system is inhibited cortical activity—low cog- 
nitive arousal? In other words, we may be dis- 
cussing two distinguishable forms of arousal, one 
cortical or cognitive and the other more periph- 
eral, sensorimotor, or organismic. All might agree 
that cortical arousal is low in boredom; the dis- 
pute would center on the second level. 

What can we say about the effects of monoto- 
nous stimulation? The finding that subjective 
repetitiveness is not the most important factor in 
boredom does not invalidate the hypothesis that 
monotony, objectively defined, is very impor- 
tant. Monotony may well have its effects on 
boredom by decreasing cognitive arousal rather 
than by increasing subjective repetitiveness. An- 
other possibility, one more acceptable in the 
light of this experiment, is that the typical effect 
of monotony issues from its ability to induce all 
four factors, as in the natural series. 

Finally, the general empirical finding that no 
one of the experimental variables was necessary 
for boredom, that is, that boredom could be and 
was produced at times with any given factor held 
constant at a neutral level, is another intriguing
area for further study. Many theorists are on 
(or over) the verge of suggesting that one or 
another of these factors is the sine qua non of 
boredom; the results here indicate otherwise. In
addition, the sometimes puzzling effects of bore- 
dom may be explained by the possibility that 
one or more of the factors are absent. For ex- 
ample, numerous studies have shown reported 
boredom without the usual decline in perform- 
ance (Barmack, 1939b; Smith, 1953); perhaps 
this boredom has developed without a decrease in 
arousal. Worker dissatisfaction, another unde-
sirable consequence of boredom, may be more
directly related to unpleasant feelings. Thus, 
while we have centered on the determinants, re- 
search on the effects of boredom has also been 
given guidelines. 
ACCURACY OF EMPATHIC JUDGMENTS OF ACQUAINTANCES
AND STRANGERS 


RONALD TAFT 
University of Western Australia 


This study investigates the relationship between familiarity with a person and 
ability to make accurate empathic predictions of that person’s behavior. 
Psychology students predicted the responses of 2 fellow students on an adjec- 
tive list using the Q-sort technique; one fellow student was known well to 
the judges and the other only slightly. The judgments of the acquaintances 
were more accurate than those of the strangers, but the latter were better than 
chance. The superiority in accuracy for the acquaintances was not due to the 
effect of assimilative projection and is attributed to stereotype and differential
accuracy.


Can one make more accurate personality judg- 
ments of persons whom one knows well than of 
persons whom one barely knows? The answer 
to this question is not as simple as it looks. 
Even though a judge has more information to
work with when he evaluates a close acquaintance,
beyond a certain point more information
is a handicap and may even interfere with the 
correct use of existing information (Taft, 1959). 
Knowing a person well may lead to so much 
information about him that the personality
judgments give too much weight to some data 
and far too little to other, more relevant, data. 

A further possible handicap in judging close 
acquaintances is a bias towards favorable judg- 
ments (see Sarbin, Taft, & Bailey, 1960, p. 212), 
and this could set up complicated interactions 
between accuracy, degree of familiarity, and the 
attractiveness of the object person's personality. 
Such interactions could reduce the contribution 
which familiarity might make to the accuracy 
of judgments. 

Despite the above qualifications to the value 
of acquaintanceship in personality judgments, 
it is still hypothesized that, more often than 
not, personality judgments of acquaintances are 
more accurate than those of nonacquaintances. 
In other words, familiarity with the object 
person is a positive aid to accuracy. 

The study to be reported here tests the effect
of familiarity on empathic predictions of the
object person's self-ratings. The only reported
empirical study on the relationship between 
familiarity and the accuracy of personality judg- 
ments is the much-quoted one by Ferguson 
(1949). He reported that ratings made of as- 
sistant managers by traveling field representatives 
in an insurance company became more accurate 
—actually, more reliable—as the acquaintance- 
ship of the raters with the manager increased. 
This study is very limited in its scope as well 
as uncertain in its criterion of accuracy, and 
there is a need for replication and extension to 
a more general assessment of personality. The
present study aims at meeting this need.


METHOD 
Analysis of the Process of Personality Judgment 


The method used to measure the accuracy of 
judgments in the studies described below was 
that of Q sorts of personality traits. The judge’s 
task was to predict how the object person would 
sort the traits and accuracy was measured by the 
correlation between the judge’s prediction and 
the object person’s actual sorting. Each judge 
made a prediction for a close acquaintance and 
for a comparative stranger and the accuracy 
scores for the two predictions were compared. 
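
The accuracy score described here is an ordinary product-moment correlation between two sortings of the same traits. A minimal sketch follows; the nine-item sorts and the five-pile forced distribution are hypothetical and much smaller than any real adjective list, and the function name is illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical pile placements (1-5) for the same nine traits. Every sort
# uses the same forced distribution (1, 2, 3, 2, 1 traits per pile), so the
# mean and variance are identical across sorts.
actual_sort = [1, 2, 2, 3, 3, 3, 4, 4, 5]
predicted_for_acquaintance = [1, 2, 3, 2, 3, 4, 3, 4, 5]
predicted_for_stranger = [3, 1, 4, 2, 5, 3, 2, 4, 3]

accuracy_acquaintance = pearson_r(predicted_for_acquaintance, actual_sort)
accuracy_stranger = pearson_r(predicted_for_stranger, actual_sort)
print(round(accuracy_acquaintance, 2), round(accuracy_stranger, 2))
```

Because the forced distribution fixes the mean and variance of every sort, differences between the two accuracy scores reflect only how well the pattern of placements matches.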

Cronbach (1955) has shown that judgments 
of other people can be analyzed into a number 
of possible sources of variance in accuracy 
scores. The breakdown of the variance can be 
quite complex, but for our present purposes we 
shall mention only four main sources.

1. Accurate level of elevation and dispersion 
of the judgments. Since the Q-sort method holds 
constant the mean and variance of the judgments,
these variables are controlled in our study and 
need not be considered further. 

2. Accuracy due to assimilative projection, 
that is, the degree to which the judge describes 
the object person as being similar to himself. 
When the judge and the object person are 
actually similar in their responses, assimilative 
projection leads to accuracy. There are two ways 
in which this process could have a differential 
effect on the accuracy of the judgments of acquaintances
and strangers: (a) The judge may
indulge in more assimilative projection in one 
case than in the other. Sarbin et al. (1960, p. 
213) cite several studies that suggest that this 
greater projection in judging persons whom one 
likes leads to greater accuracy in the judgments. 
(b) The acquaintances are likely to be more 
similar to the judge than are the strangers, and 
thus the judgments of the acquaintances would 
be fortuitously the more accurate due to as- 
similative projection. 

3. Stereotype accuracy may occur if the judge 
has an accurate image of how people of the 
object person's “type” usually behave in the 
situation in question. The type referred to may 
be a very broad category, such as "an adult 
human" or it may be as specific as "an extra- 
verted Australian male student aged 23." This 
ability to attain an accuracy better than chance 
through knowledge of the modal behavior of 
certain types of people probably explains the 
apparent reliability and validity of first impres- 
sions (Allport, 1937). Stereotype accuracy could 
be expected to apply to judgments of strangers 
about whom some minimal information is known, 
but judgments of acquaintances have the added 
advantage of a greater knowledge by the judges 
of the categories to which the object person 
belongs. The more complex the categories to 
which the object person can be “instantiated” 
(Sarbin et al., 1960), the more accurate the
judgments—provided, as pointed out in the 
introduction, the judge is capable of using this 
extra information. Sarbin et al. argue that all 
judgments are based on stereotypes, that is, 
that the judge proceeds by simultaneous and 
successive “taxonomic sortings" of the object 
into relevant categories with known character- 
istics. When the categories are simple, readily 
observable, or broad, we speak of the “stereo- 
type accuracy” of the judgments and when they 
are complex, covert, or narrow, we are more 
likely to speak of the “differential accuracy.” 


1 The term assimilative projection is preferred to
Cronbach’s term “assumed similarity” which implies
a deliberate inference on the part of the judge.

4. Differential accuracy could be a source of
greater accuracy in judging acquaintances to the 
degree to which the judge is able to combine 
the additional categories and is aware of the 
implications of these categories in the veridical 
world. The judge may even know accurately 
when to use assimilative projection and when 
not to in judging acquaintances. For the purpose 
of this study, differential accuracy may be taken 
as: total accuracy − (stereotype accuracy + accuracy
due to assimilative projection).

To summarize the foregoing, each source of 
accuracy would seem to favor judgments of 
acquaintances over those of strangers. Our experi- 
ment will investigate whether this holds up in 
practice and will attempt to determine the 
sources of accuracy in judging both types of 
object person. 


Procedure 


As part of their regular laboratory exercises, 
members of a senior level class in psychology were 
required to carry out “Q sorts” on a list of traits. 
The list consists of 58 adjectives selected from the 
100-word list developed by Crandall and Bellugi 
(1954) to represent the whole range of personality 
traits on social desirability and word familiarity.²
The subjects were instructed to rate the adjectives
on a 7-point scale according to how “typical” they
were of the person concerned. The distribution of 
the 7 points was predetermined according to the 
normal curve. 

Inter alia, the judges were required to predict how 
two of their fellow students would sort the adjec- 
tives when rating themselves. One of the students 
was the member of the class whom the judge 
"knows best" and the other was the one whom 
the judge “knows least, or virtually not at all.” The 
judges were advised to label the latter person by 
a description if they did not know his name. 


Pilot Study 


To test whether the judgments of acquaintances 
were more accurate than those of strangers in gross 
terms, a pilot study was conducted on 42 members 
of two psychology classes. The subjects consisted of 
both full and part-time students in the last year 
of their courses.

The criterion of accuracy, the object person's 
actual Q sort, was correlated with the judge's 
predictions in order to obtain an index of accuracy. 
These product-moment correlations were converted 
into normalized z scores in order to facilitate the 
computations, and the mean accuracy for the judgments
of acquaintances was .43 compared with .31
for the judgments of strangers. The difference was
significant at the .01 level (t = 3.6, df = 41).
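
The conversion and comparison described in this paragraph can be sketched as follows. The sketch assumes that the "normalized z scores" are Fisher's z transformation of r and that the comparison is a paired t test across judges; the four pairs of correlations are invented for illustration and are not the study's data.

```python
from math import atanh, sqrt

def fisher_z(r):
    """Fisher's z transformation of a correlation coefficient."""
    return atanh(r)

def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two matched lists."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / sqrt(var_d / n), n - 1

# Hypothetical accuracy correlations for four judges.
r_acquaintance = [0.50, 0.35, 0.60, 0.42]
r_stranger = [0.30, 0.40, 0.35, 0.20]

z_acquaintance = [fisher_z(r) for r in r_acquaintance]
z_stranger = [fisher_z(r) for r in r_stranger]

t, df = paired_t(z_acquaintance, z_stranger)
```

With 42 judges, as in the pilot study, the same function would return df = 41, matching the value reported above.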


2 The list of adjectives used may be obtained on
application from the author.

Main Experiment


The subjects consisted of 62 senior undergraduate 
psychology students. 

In Sample I there were 13 males and 10 females, 
and in Sample II, 17 males and 22 females. 

After selecting their two object persons, the judges 
indicated how well they knew them. The points on 
the scale were: 4, very familiar (outside university);
3, quite familiar as fellow student; 2, sporadic class 
contacts; 1, barely know (would recognize outside 
class); 0, never really noticed before (would not 
recognize outside class). 

The mean rating for the “know best” persons was 
3.3, and for the “know least” it was .8. For all but
four subjects, the difference in the degree of acquaintanceship
was 2 or more points. We can conclude,
therefore, that the two groups of object persons 
fulfill the requirements of the experimental design 
with regard to familiarity; almost all of the ac- 
quaintances were “quite” or “very” familiar and 
almost all of the nonacquaintances were virtual 
strangers. (Two subjects did not distinguish between 
the person whom they “knew best” and the one 
they "knew least,” and their results were dis- 
carded.) 

The procedure in Group II differed from that 
in Group I in that the judges were asked to choose 
the fellow student whom they knew best and the 
one whom they knew least and to rate them on 
degree of familiarity before they were told what 
the further procedure would involve. This control 
was introduced in order to avoid any possible 
effect that knowledge of the nature of the task 
might have on the choice and rating of the two 
object persons. In fact, there were no differences in 
the results for Groups I and II on the measures 
of accuracy and the results were therefore pooled. 


RESULTS 


The accuracy scores for the pooled groups are 
presented in Table 1. The mean score of .52
for the judgments of acquaintances is signifi- 
cantly higher than the score of .42 for the judg- 
ments of strangers. The latter score is, however, 
also significantly higher than 0 (t = 14).

In order to study the role played by actual 
similarity between judge and object person and 
by assimilative projection, a further analysis was 
made of the correlations between the self-ratings 


TABLE 1

MEAN ACCURACY SCORES (z TRANSFORMATIONS OF r)
FOR ACQUAINTANCES AND STRANGERS
(N = 60)

        Acquaintances   Strangers   t
M           .52            .42      2.38*
SD          .22            .23

* p = .01, one-tailed.

TABLE 2

SCORES ON ACTUAL SIMILARITY AND
ASSIMILATIVE PROJECTION
(N = 60)

                                Acquaintances   Strangers   t
Actual similarity between
judge and object person
  M                                 .32           Spec      7:18
  SD                                .19           .20
  Correlation with accuracy         .38**         .35
Assimilative projection by
judge
  M                                 .53           .45       2.04*
  SD                                2.            .31
  Correlation with accuracy         2             .05

* p = .05.
** Significantly greater than zero, p < .01.


of the judges and the object persons (actual 
similarity) and between the judge’s self-ratings 
and their predictions for the object persons 
(assimilative projection). The results are presented
in Table 2. As expected, there was some
degree of actual similarity between the judge 
and the object person and this was significantly 
correlated with total accuracy. Further, the de- 
gree of assimilative projection was indeed greater 
in the judgments of acquaintances than in those 
of strangers. However, this greater assimilative
projection was not an advantage in attaining 
accuracy, since the actual similarity between 
the judge and his familiar object person did not 
differ from that between the judge and his un- 
familiar one, nor did assimilative projection 
correlate with the total accuracy scores. The 
higher accuracy of judgments of acquaintances 
cannot, therefore, be attributed to an interac- 
tion between actual similarity and assimilative 
projection.
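
The three indices discussed here can be computed from the same three sorts. The following is a rough sketch with hypothetical data and a hypothetical helper `pearson_r`; it shows the limiting case in which a judge projects completely, so that accuracy can be no higher than the actual similarity between judge and object person.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length sorts."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical 7-point placements for eight traits.
judge_self = [1, 2, 3, 4, 4, 5, 6, 7]    # judge's own self-rating
object_self = [2, 1, 3, 4, 4, 6, 5, 7]   # object person's self-rating
prediction = list(judge_self)            # judge predicts "he is just like me"

accuracy = pearson_r(prediction, object_self)
actual_similarity = pearson_r(judge_self, object_self)
assimilative_projection = pearson_r(judge_self, prediction)
```

In this extreme case assimilative projection is 1 and accuracy equals actual similarity, which is why projection helps only to the extent that judge and object person really are alike.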

A separate analysis was made of similarity in 
sex. Forty-seven of the choices of “know best” 
persons were of the same sex, compared with 
only 26 of the “know least.” This is a signifi- 
cant difference, but it did not affect the accuracy 
of these ratings since there was no relationship 
between similarity of sex and accuracy. To com- 
plete the picture, sex similarity also made no 
difference in the degree of assimilative projec- 
tion. Actual similarity, however, was influential 
—in the wrong direction: there was signifi- 
cantly more similarity between the judges and 
object persons when they were opposite in sex
than when they were the same, especially when
the object person was a stranger. No attempt
will be made to explore fully this unexpected 
finding, but, barring pure chance, the explanation
must lie in factors that determine the choice 
of the object person. Whatever the cause of 
this similarity to the opposite sex, it did not 
lead to greater accuracy nor even to greater 
assimilative projection. 

This experiment also provided an opportunity 
to measure the judge's consistency in his em- 
pathic performance from one type of object 
person to the other. There was a correlation of 
.47 between the amount of assimilative projec- 
tion in the judgments of acquaintances and the 
judgments of strangers. Thus, assimilative pro- 
jection appears to be a consistent characteristic 
of the judges. The accuracy of the judgments, 
on the other hand, was not at all consistent 
(correlation —.01) and it appears to be specific 
to the situation.³


Discussion 


Although the expectation that judgments of 
acquaintances would be more accurate than those 
of strangers was confirmed, the judgments of 
strangers were more accurate than chance, and 
in 35% of the cases were even more accurate 
than the judgments of acquaintances. 

Having ruled out assimilative projection as 
a contribution to the accuracy of the ratings, 
we must attribute the accuracy of the judgments 
to both stereotype and differential accuracy. 

As pointed out in the introduction to this 
paper, these two sources of accuracy are dif- 
ficult to distinguish excepting in terms of 
obviousness and grossness of the categories 
used in the inferences. In the case of the judg- 
ments of the stranger object persons, the 
broadest category that would have been available 
was that which was common to all of the object 
persons, namely, “senior student in psychology.” 
An examination of the actual similarity scores 
(Table 2) suggests that the self-ratings by the 
members of the class resemble each other at a 
better than chance level, whether the object 
persons are males or females, acquaintances or 
strangers. Thus, there is a modal response for
the members of the group, and, provided that 
the judges were aware of this, they could have 
attained some accuracy through the application 
of their stereotype of this modal response. We 
cannot say for sure that this is the mechanism 
that accounted for the stereotype accuracy of 
the judgments, but the correlations between 


3 This lack of consistency supports Vernon’s
(1933) finding that the correlates of ability to judge
acquaintances and strangers form two separate
clusters.
actual similarity and accuracy are suggestive
of this. 

Thus, the accuracy of the judgments of 
strangers may be attributable entirely to a 
general stereotype of members of the psychology 
class plus, probably, some additional overtly
observable information such as the sex, age, and 
the expressive behavior of the object persons. 
The judgments of the acquaintances obviously 
used additional information obtained through 
greater familiarity with them. Barring the unlikely
possibility that the judges were familiar
with how the object person rates himself on per- 
sonality inventories, this added information must 
arise from the use of more differentiated infer- 
ential categories than those used in the stereo- 
typed judgments of the strangers. Our results 
support the conclusion that, on the whole, our 
judges were able to use these more differentiated 
categories in a valid manner. We also infer 
that the advantage of the extra information ob- 
tained through familiarity was not outweighed 
by any bias that might have arisen from the 
affective relationships between those pairs of
judge and object person who were fairly close 
friends. 


A COMPARISON OF GROUP AND INDIVIDUAL PERFORMANCE
WHERE SUBJECTS HAVE VARYING TENDENCIES 
TO SOLVE PROBLEMS 


MORTON GOLDMAN 


University of Missouri at Kansas City 


This study compares the performance of individuals who previously solved 
problems both correctly and incorrectly with 2-member groups, where both 
group members previously solved problems correctly, both members previously 
solved problems incorrectly (choosing the same or different incorrect answers), 
and where one group member previously solved problems correctly and the 
other previously solved problems incorrectly. The group operated more effec- 
tively than the individual under most conditions, being most effective when 
both group members initially solved the items correctly. Only when both 
group members initially chose the same wrong answer did the group operate
poorly.


The question of whether the group does better
than the individual is one of the classical problems
of social psychology. In general, most
studies found that group performance was su- 
perior to that of individuals. Many reasons for 
the better group performance have been offered, 
such as: individuals working in a group will 
tend to correct wrong answers and thus serve 
as a check for each other; when interacting 
with others one tends to prune his thoughts
more carefully¹; a division of labor is sometimes
possible among group members; there exist 
“pseudo group effects” (Secord & Backman, 
1964, pp. 374-375). Studies dealing with indi- 
vidual performance versus group performance 
have been reviewed by Kelley and Thibaut 


1 Although a recent study by Taylor, Berry, and 
Block (1958) showed that this could serve to the 
disadvantage of the group. 
(1954), Lorge, Fox, Davitz, and Brenner (1958), 
and Hare (1962). 

It is the aim of the current study to examine 
in greater detail than has been done before 
the solving of problems by groups and individ- 
uals. To be considered are some conditions that 
exist when an individual works alone² and when
a group consisting of two individuals works to- 
gether to solve problems. 

When an individual works alone on a problem,
two results may occur: the person can solve
the problem correctly, or the person can solve
the problem incorrectly (not knowing the an- 
swer is being equated to solving the problem 
incorrectly). However, when two people work 
together on a problem, four conditions can occur: 
both individuals working alone could have solved 
the problem correctly; one individual, working 
alone, could have gotten the problem right and
the other individual, working alone, could have
gotten the problem wrong; both individuals, 
working alone, could have gotten the problem 
wrong in the same way; and both individuals, 
working alone, could have gotten the problem 
wrong in different ways. 
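
The four pair conditions listed above can be expressed as a small classification rule. This is a sketch for illustration only: the function name `classify_item` and the category labels are hypothetical, and it assumes each member recorded some answer to the item.

```python
def classify_item(answer_a, answer_b, correct):
    """Classify one test item for a pair from the two members' individual answers."""
    a_right = answer_a == correct
    b_right = answer_b == correct
    if a_right and b_right:
        return "both right"
    if a_right or b_right:
        return "mixed"            # one member right, the other wrong
    if answer_a == answer_b:
        return "same wrong"       # both chose the same incorrect answer
    return "different wrong"      # both wrong, but with different answers

# Hypothetical item whose keyed answer is "c".
labels = [
    classify_item("c", "c", "c"),
    classify_item("c", "a", "c"),
    classify_item("a", "a", "c"),
    classify_item("a", "b", "c"),
]
```

Applied to every item of the first administration, such a rule would identify the items falling into each condition for a candidate pair.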

This study proposes to set up the conditions 
spelled out above and to examine the various 
comparisons which are thereby permitted. Thus, 
the current study examines in some detail the 
reasons why groups may or may not be more 
effective than individuals in the solution of prob- 
lems. It could be that improvement of the group 
over individuals only takes place when certain 
specific conditions occur. Perhaps only when 
individuals tend to have different wrong answers 
to a problem will they be more effective working 
together than working as individuals, while if 
there exists a tendency for them to solve the 
problem correctly individually, then no benefit 
is obtained when they work as a group. 


METHOD 

The subjects for this experiment were 68 under- 
graduate students at the University of Missouri at 
Kansas City. Using the regular test instructions, the 
subjects in their instructional classrooms (three 
liberal arts classes) were administered Form D of 
the Wonderlic Intelligence Test. The test was ad- 
ministered without imposing a time limit. However, 
almost all of the subjects completed the test in 25 
minutes. 

2 As will become apparent when the conditions of
the experiment are more fully explained, the indi- 
vidual-working-alone treatment designates individuals 
working independently on problems in a room where 
other individuals are also independently working on 
similar problems. Thus, the individual-working-alone 
implies working independently without a partner 
but does not imply working in isolation. 

Six weeks after the first test administration, 32 of
the subjects, approximately half of the original 
sample, were asked to retake the same Wonderlic 
test. The remaining 36 subjects were paired on the 
basis of the first administration so that the following 
conditions were true for each of the pairs: there 
existed three items on which both subjects had ob- 
tained the same wrong answer; there were three 
items on the initial test which both subjects got 
right; there were three items on which both sub- 
jects had obtained different wrong answers; and 
there were three items where one subject had ob- 
tained the wrong answer and the other subject had 
obtained the right answer. Each paired group was 
separately administered the original test with the 
following instructions: 


You will be taking this test today as groups. Work 
on each item with your partner, discuss it freely, 
and decide upon a mutually-agreed-upon right 
answer. Both members of the group must indicate
on their answer sheet the answer that has been 
mutually accepted as the correct answer. 

You will have enough time to complete the test. 
When you have finished, return the test booklets 
and the answer sheets. 


Thus each paired group worked together as a team, 
discussed each question of the test and recorded the 
mutually-agreed-upon answer. The groups were permitted
to work without any time limit imposed and
no record of the time required was kept.

The experimental setup used allows each subject 
to serve as his own control. In the case of the 
pair groups for all the above conditions, each sub- 
ject’s test could be examined to see the degree he 
improved on the second administration, the group 
administration, over the first administration, the 
individual administration. Moreover, the subjects retaking
the test as group members could be
compared to the subjects retaking the test as individuals.³

RESULTS 


As described above, two subjects were placed 
in every group according to the initial test ad- 
ministration so that there existed: three items 
which both subjects had correct, three items 
on which both subjects had the same wrong 
answer, three items on which both subjects had 
different wrong answers, and three items which 


3 It should be noted that the experimental design 
used made the assumption that if on the first ad- 
ministration of the test a subject got an item correct, 
then he was more apt to get this same item correct 
on the second administration of the test. Similarly, 
if on the first administration of the test a subject 
got an item incorrect, then he was more apt to get 
this same item incorrect on the second administra- 
tion of the test. The means presented in Table 1 for 
the individuals who initially solved the problems cor- 
rectly and for the individuals who initially solved 
the problems incorrectly give support to this assumption.

TABLE 1

MEANS OBTAINED ON SECOND TEST ADMINISTRATION
FOR THE SIX SETS OF ITEMS

Sets based on subjects (36) working in paired groups
  Both subjects initially right                    2.89
  Both subjects initially same wrong                .78
  Both subjects initially different wrong          1.56
  One subject right, one subject wrong (mixed)     2.39

Sets based on subjects (32) working as individuals
  Initially wrong                                  1.41
  Initially right                                  2.53
  Combined                                         1.97


one of the individuals in 
and the other had wrong. 


subjects working as individuals who initially had 
the items correct (t = 2.59, p < .02).⁴ Thus there
appeared to be a group effect for subjects who
previously solved problems correctly, the group
i the individual, 

by the paired groups where 
both subjects initially had the items incorrect, 
but had chosen different wrong responses, was


4 All probabilities stated in this study were based
on two-tailed tests.


uals who initially had the items incorrect was 
significantly greater than the mean obtained by 
the paired groups where both subjects initially 
had the items incorrect by choosing the same
wrong response (t = 3.90, p < .01). The group
effect was negative when the two group members previously
got a problem wrong for the same reason.
But even when the two group members
previously got the problem wrong by choosing 
different wrong responses they did not score 
higher (also, not lower) than an individual sub- 
ject working alone who originally got the problem 
wrong. 

The mean obtained by the paired groups 
where, for each of the three items, one of the
group members initially had the item correct
and the second group member initially had the
item incorrect (mixed-pair group) was not significantly
different from the mean obtained for
the subjects working as individuals who initially
had the items correct. The mean of the mixed-pair
groups was significantly greater than the
mean of the subjects working as individuals who
initially had the problem wrong (t = 4.90, p <
i member of a paired group 
was apt to solve a problem the group does not 
do worse (also not better) than an individual 
working alone who was apt to get the problem
correct. A pair group where one of the members
previously got the problem right did do better
than an individual working alone who previously 
got the problem wrong. 


Thus, a paired group, where one group member
had a tendency to
get a problem right and the second group member
had a tendency to get the problem wrong,
did better than the combined score of two individuals
working independently where one individual
initially had the problem right and the
other initially had the problem wrong. The
group effect operated most effectively (positively)
when both members of a paired group
initially solved a problem correctly.⁵


Discussion 


each have the tendency to get them wrong, 
choosing the same incorrect response, then the 


sponse. Thus where problems are to be solved
which for a group of individuals have an item
difficulty index of less than .50, the results
obtained in this study would predict that paired
groups would perform better than individuals
working alone. If the problems to be solved
are very difficult for a group of individuals and
that the same wrong 
then pair groups com- 
will do more poorly 
than if the subjects were to work alone as


The study also shows that when a pair group 
consists of a member who has a tendency to 
get a problem right and a member who has 
the tendency to get a problem wrong, that pair 

working 
alone. Thus, for problems which have an item 
difficulty of approximately .50, pair groups would 
solve them better than individuals working alone. 
This result would be further enhanced if the 


more capable member of the pair group was


5 In each of the pair groups, the total score on
the first test for each group member was known.
Although the current study was not designed to
answer this question, the mixed-pair groups where
the higher total initially scoring group member got
the right answer on one of the three items examined
in the second test administration could be compared
with the mixed-pair groups where the lower total
initially scoring group member got the right answer
on one of the three items. A chi-square to test for
this comparison was significant at the .10 level. The
results are only suggestive. More definitive results
might be obtained in an experiment designed for the
study of the issue raised.


RELATIONSHIP BETWEEN PERCEPTUAL DEFENSE AND
EXPOSURE DURATION¹


MICHAEL J. GOLDSTEIN 
University of California, Los Angeles 


A perceptual defense study was done in which recognition guesses were ob- 
tained in a random series over 8 exposure levels yielding a range from chance 
to 100% recognition level. Ss received 336 exposure trials over 3 sessions to
obtain stable functions. A shift from defense to vigilance was found between 
Day 1 and Day 2 for all Ss. The response tendency noted at the lowest exposure
levels (vigilance or avoidance) was found to be relatively constant over
the exposure range, failing to confirm the recent findings of Dorfman, Grossberg,
and Kroeker (1965).


In a previous paper the writer suggested 
(Goldstein, 1964) that the probabilities of using 
threat and nonthreat words in a perceptual recog- 
nition situation remain relatively invariant as 
stimulus information is increased. Recently
Dorfman, Grossberg, and Kroeker (1965) have
presented evidence which suggests that this assumption
may be invalid and that, in fact, a
significant shift from avoidance to vigilance
occurs as exposure duration is increased. Unfortunately,
this study used the taboo word paradigm
in which a fixed group of socially unacceptable
words was used as stimuli for all
subjects. Most of the present writer’s research
has been done on a population of words of 
minimal social unacceptability which are selected 
individually for threat potential on the basis of 
prior word association performance. The pur- 
pose of the present report is to present data 
relating differential accuracy for threat and non- 
threat stimuli to exposure duration within the 
latter paradigm. 


METHOD 


Subjects were 10 undergraduates, 5 males, 5 
females, who volunteered to satisfy a requirement 
from the introductory psychology course at UCLA. 
Subjects were seen over three experimental sessions, 
which were approximately 48 hours apart. In the 
first session, subjects received the word association 
test and criteria previously described (Goldstein, 
1962) were used to select three threat and three 
neutral words. These words were typed on sheets of 
bond paper and the subject was asked to learn the 
list to a criterion of two correct trials prior to the 
perceptual recognition task. 

Following this, two blocks of trials were run and 
the next experimental session was scheduled.

1 Supported by NIMH Grant MH04720 and 08744.
The author would like to express his appreciation to
Samuel Himmelfarb and Paul Lewinson for their
help with this experiment.

Stimuli were flashed tachistoscopically using a modified
NYU tachistoscope (Kaswan & Young, 1963). Eight 
different exposure times were used which had been 
found in pilot work to lead to accuracy scores rang- 
ing from chance to 100% accuracy with stimuli of 
this sort. The six words were presented at each of the 
eight exposure times according to a random order 
which randomized word presentation and exposure 
duration within each block of 48 exposure trials. 
Thus, within a block of 48 trials, the three threat 
words and the three neutral words were each pre- 
sented once at each exposure duration. Seven such 
blocks of 48 trials were run during the experiment; 
two on Day 1, three on Day 2, and two on Day 3. 
In order to separate a blocks effect from the par- 
ticular random order used to present stimuli, five 
different orders of seven blocks were constructed 
using different patterns of random order sequences. 
Thus a subject might receive Random Order 1 as his 
first block, Random Order 2 as his second block, 
etc. Another subject might receive Random Order 3 
as his first block, Random Order 1 as his second, 
etc. The design of the experiment balanced these 
orders so that statistical analysis of sequence effects 
could be done. Over the three sessions, each subject 
had 336 exposure trials which provided stable re- 
sponse functions for each subject. 
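
The trial structure described above (6 words at each of 8 exposure durations per block, 7 blocks in all) can be sketched as follows. This is a simplified illustration: it shuffles each block independently with a seeded generator, whereas the study used five prearranged and balanced random orders; the word labels and the function name `make_block` are hypothetical.

```python
import random
from itertools import product

def make_block(words, exposure_levels, rng):
    """Return one block in which every word appears once at every exposure level."""
    trials = list(product(words, exposure_levels))
    rng.shuffle(trials)  # randomizes word and exposure duration together
    return trials

words = ["T1", "T2", "T3", "N1", "N2", "N3"]  # three threat and three neutral words
exposure_levels = list(range(1, 9))           # eight exposure durations, coded 1 to 8

rng = random.Random(0)                        # fixed seed so the orders are reproducible
blocks = [make_block(words, exposure_levels, rng) for _ in range(7)]

trials_per_block = len(blocks[0])             # 6 words x 8 levels = 48
total_trials = sum(len(block) for block in blocks)  # 7 x 48 = 336
```

The counts reproduce the 48 trials per block and 336 trials per subject stated in the text.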

Each subject was instructed to respond to each 
exposure flash with one of the six words on his list. 
The list was attached to the tachistoscope adjacent 
to the eyepiece. 


RESULTS 


Two dimensions of response were of interest, 
the relative accuracy for threat and nonthreat 
words as a function of exposure duration and 
the probability of using threat words as guesses, 
regardless of accuracy, as a function of exposure 
time. Initially, analysis of variance was done on 
these data, using the results for all three sessions. 
For the accuracy data, exposure time was highly 
significant (F=110, p< .0001). Blocks was 
significant (F = 6.65, p< .01), indicating that 
subjects became more accurate as the experiment
progressed. The interaction between blocks and
type of word (threat versus neutral) was also 
significant (F = 2.67, p< .05) which indicated 
that a rather dramatic shift in differential accuracy
occurred between the first and second experimental
session. In the first session most subjects
showed perceptual defense while by the
second session the main pattern was perceptual 
vigilance which continued into the third session. 
The interaction of word type and exposure time 
was nonsignificant (F < 1), indicating no change
in relative accuracy for threat and nonthreat
words with increasing exposure time. Thus, we
were unable to find the type of shift in accuracy 
reported by Dorfman et al. (1965). 

However, it is possible that the shift from de- 
fense to vigilance from the first to second ses- 
sions may mask interactions with exposure time 
present in the data. Therefore, a separate analy- 
sis was done for each experimental session. The 
accuracy scores for threat and nonthreat words 
were averaged for each session to reveal the 
average number of words correctly recognized at 
each exposure level. The results of this session 
by session analysis are presented in Figure 1. 
It can be seen that in Session 1 perceptual de- 
fense was the modal pattern but that except for 
one reversal at Exposure Level 3, the pattern
was relatively constant over the range of ex-
posure times. The data for Session 2 indicate a
modal pattern of perceptual vigilance which also
is relatively invariant over the exposure range. 
The data for Day 3 show no conclusive trend 
one way or another. When statistical analysis 
was done on each day's data separately, not 
one of the interactions between type of word 
and exposure time approached significance (all 
Fs <1). 

[Figure 1: three panels (Day 1, Day 2, Day 3); vertical axis, % correct.]

FIG. 1. Percentage of correct identification for
emotional (E) and neutral (N) words at eight
exposure durations on 3 experimental days.

Another approach to this analysis is to con-
sider the probability of emitting threat words,
independent of their accuracy. If an interaction
with stimulus information is present, it should
affect these response probabilities in the direction
of increasing usage of threat words. Therefore
the average frequency of usage of threat words
for each exposure level was computed and trans-
formed into a percentage of possible calls at
that level. It should be pointed out that with
increasing accuracy there would be a tendency 
toward a 50% usage rate since as subjects ap- 
proach 100% accuracy they are doing so by 
using the classes of threat and neutral words 
with equal frequency. These percentages of usage 
are presented in Table 1, broken down by experi- 
mental day and summed over all three days. 
Analysis of variance of the summed data failed 
to reveal a significant change in response prob- 
ability with increasing exposure duration (F = 
1.05, ns). Thus, over all three sessions the prob- 
abilities of using threat words as recognition 
guesses remain invariant with increasing exposure 
duration. Once again a significant blocks effect 
was found (F=6.21, p<.01) which is indi- 
cated in Table 1 by an increasing rate of using 
threat words from Session 1 to Session 3. How-
ever, since these systematic shifts were present,
it was felt necessary to analyze each day sepa- 
rately for exposure time shifts. When this was 
done no significant Fs were found for exposure 
time on any day. There is a trend in the data 
for the usage rates at the lowest exposure level 
to be the lowest in a series, but there is no 
consistent upward trend from that point. On 
Day 1, for example, in which the largest dif-
ferences in accuracy between threat and non-
threat words are present, the probabilities indicate
less usage for threat words across the exposure 
range without a single reversal. 

TABLE 1
PERCENTAGE OF POSSIBLE EXPERIMENTER WORD CALLS OVER EIGHT EXPOSURE LEVELS
Day 1 2 3 4 5 6 7 8
47.5 40.8 40.0 44.2 45.8 48.3
2 48.3 43 51.1 50.0 51.1 53.3 51.1 50.6
3 43.3 60.8 52.5 57.5 58.3 52.5 51.7 50.0
All days 44.8 50.7 50.5 49.5 50.0 50.5 49.8 49.8

It can be argued that analyses of group data
may not be sufficiently sensitive to exposure
time shifts or may not pick up cancellation of
trends in the data (e.g., initial sensitization shift- 
ing to avoidance canceling out the opposite 
trend). Therefore, using the most clear-cut data 
from Day 1 individual comparisons were done 
contrasting a subject's trend for the two shortest 
exposure durations with his trend for the two 
longest exposure durations. Subjects were clas- 
sified as showing greater or less accuracy for 
threat words as compared to their neutral word 
score for the two duration levels. Of the subjects 
who showed less accuracy for threat words (per- 
ceptual defense) at the faster exposure durations, 
6 showed the same trend at the longest exposure 
level, while 2 showed a reversal. Of the 2 who 
showed greater accuracy for threat words at the 
shortest exposure level (perceptual vigilance), 
both showed vigilance at the longest exposure 
levels. Thus, 8 out of 10 subjects were consistent 
at the extreme exposure levels, a finding which 
does not suggest that the group trends are mis- 
representing individual functions. 


DISCUSSION


We have completely failed to confirm the 
finding of Dorfman et al. (1965) and have in- 
stead found considerable stability for differential 
accuracy and response probability across the 
exposure duration range. How can we account 
for such different findings? One possibility lies 
in the nature of the stimulus materials. Possibly,
when words of high social unacceptability are 
used as stimuli different response patterns are 
indeed present when the exposure duration is 
varied. It could be that on trials in which there 
is low informational yield, subjects are very 
cautious about using taboo words and avoid them,
causing a defense effect. However, at higher
exposure levels where the informational yield is 
greater this caution may be felt less necessary and a 
compensatory overusage may be present. These 
decision processes may be less appropriate to 
situations in which stimulus materials do not 
involve the verbalization of taboo words, as in 
the present study. 

Another focus for reconciling the disparate 
results may be the different task demands in the 
two experiments. In the present experiment sub- 
jects had to select one word from a set of six 
words which were designed to have a high con-
fusability level. The confusability was ensured
by using three sets of words, one threat and one
neutral in each set, each matched for frequency
of usage, length, and starting letter. Thus, even
at the higher exposure levels, subjects had to
discriminate more than the initial letter to make
a correct identification. In the Dorfman et al.
study, binary judgments were required between
a set of two stimuli which were low in confusa- 
bility. Thus, in order to make a correct identifica- 
tion all that was needed was detection of the 
starting letter of a word (either P or M was
sufficient information to decide the binary choice, 
PENIS-MIXER). Thus at the lower exposure levels 
where this information may be unavailable, sub- 
jects may show a response bias against using 
taboo words which results in perceptual defense. 
However, once the exposure is reached at which 
this initial letter discrimination is possible, there 
is no longer ambiguity about which stimulus is 
present. This low ambiguity factor may account 
for the shapes of the curves reported in the 
Dorfman et al. paper in which there is a major 
shift in accuracy between 30 and 60 milliseconds 
but very slight change after that level. Once the 
first letter is discriminated all other information 
is redundant. This argument explains why the
perceptual defense effect would disappear rather
early in the exposure series; it still does not 
account for the crossover noted by Dorfman 
et al. 


AN EXPERIMENTAL TEST OF THE SEQUENTIALITY
OF DEVELOPMENTAL STAGES IN THE CHILD'S
MORAL JUDGMENTS 1


ELLIOT TURIEL 2
Yale University 


2 developmental propositions of Kohlberg's theory of moral judgments were
tested: (a) that the stages form an invariant sequence, and, thus, more learn-
ing results from exposure to the stage directly above one's level than to stages
further above; (b) that passage from 1 stage to the next involves integra-
tion of the previous stages, and, thus, more learning results from exposure to
the stage directly above than to the stage 1 below. First, Ss’ stages were
determined in a pretest. 44 Ss of Kohlberg's Stages 2, 3, and 4 were equally
distributed among 3 experimental groups and 1 control group. In the treat-
ment conditions, Ss were exposed to either the stage 1 below, 1 above, or 2
above the initial dominant stage. The control group was not administered a
treatment condition. In a posttest the influence of the treatment conditions
was assessed. The results confirmed the hypotheses since exposure to the stage
directly above was the most effective treatment.


Moral development has been approached 
from different viewpoints. Developmental 
theories such as Piaget’s (1948) focus on 
the cognitive processes underlying moral re- 
sponses and assume that the organization of 
these processes is different at different stages 
of development. The greater part of develop- 
mental research on morality has stemmed 
from Piaget’s theory of moral stages, stages 
supported only to a limited extent by subse- 
quent investigations (see Kohlberg, 1963b). 
Kohlberg (1958, 1963a) has postulated the 
following set of moral stages, which are based 
on children's reasoning in response to hypo-
thetical moral conflicts (Kohlberg, 1963a):


Stage 1: Punishment and obedience orientation.

Stage 2: Naive instrumental hedonism.

Stage 3: Good-boy morality of maintaining good
relations, approval of others.

Stage 4: Authority-maintaining morality.

Stage 5: Morality of contract and democratically
accepted law.

Stage 6: Morality of individual principles of con-
science [pp. 13-14].


1 This study is based on a dissertation presented
to Yale University in candidacy for the degree of
Doctor of Philosophy. It was conducted while the
author held a United States Public Health Service
predoctoral fellowship. The author wishes to express
his gratitude to the members of the dissertation
committee: Edward Zigler, Irvin Child, Merrill Carl-
smith, and Robert Abelson. The author is also
indebted to Lawrence Kohlberg for his invaluable
advice and aid. Thanks are due to Rita Senf for
her critical reading of the manuscript.

2 Now at Bank Street College of Education and
the Center for Urban Education, 33 West 42nd
Street, New York, New York.


While space does not permit a detailed 
definition of Kohlberg’s stages nor of his
methods for the elicitation and stage clas- 
sification of responses, the Method section 
should clarify the nature of his data. 

Kohlberg postulated that his stages define 
a sequence normally followed by each indi- 
vidual. The sequence of the stages is hypothe- 
sized to be invariant, with the attainment 
of a mode of thought dependent upon the 
attainment of the preceding mode, requiring 
a reorganization of the preceding modes of 
thought. Evidence for this hypothesis (Kohl- 
berg, 1963a, pp. 15-17) consists, first, of 
findings of age differences, in various cultures, 
consistent with the notion of sequence and, 
second, of findings of a “Guttman quasi- 
simplex” pattern in the correlations between 
the various types of thought, a pattern
expected if they form a developmental order. 

While this evidence supports the validity 
of the stages as forming a fixed sequence, 
there has been no experimental evidence. The 
aim of the present research was to subject 
Kohlberg’s hypotheses to an experimental test. 
In particular, the concept of developmental 
sequence suggests some hypotheses regarding 
developmental change and learning of new 
moral concepts. The plan of the study was
to select subjects at varying developmental 
stages, expose them to moral reasoning that 
differed from their dominant stage, and then 
test the amount of learning and generaliza- 
tion of the new concepts. First, part of the 
Kohlberg moral judgment interview was ad- 
ministered to determine the subject's domi- 
nant stage. With the remaining part of the 
Kohlberg interview, the subject was then ex- 
posed to concepts corresponding to a stage 
differing from his own. Some subjects were 
exposed to the stage that was one below their 
own, some to the stage one above, and some 
to the stage two above. Finally the subject 
was retested on the entire interview. If Kohl- 
berg's stages do form a fixed developmental 
sequence, so that the attainment of a mode
of thought is dependent on the attainment 
of the preceding mode, then it is expected 
that subjects exposed to the stage directly 
above their dominant stage would show more 
usage of that stage on the retest than would 
subjects exposed to stages two above or one 
below.

Thus this study was designed to test the 
following two hypotheses: 

1. That Kohlberg's stages form an invari- 
ant sequence so that an individual's existing 
mode of thought determines which new con- 
cepts he can learn. It was expected that sub- 
jects exposed to reasoning corresponding to 
a stage directly above their dominant stage 
would be influenced more than those exposed 
to reasoning corresponding to a stage further 
above. 

2. That each stage represents a reorganiza-
tion of the preceding stages, and in effect is
a displacement of those stages. If each stage
is a reorganization of the preceding stages,
rather than an addition to them, then a
tendency to reject lower stages would be ex-
pected, so that subjects exposed to a stage
one above would be influenced more than 
those exposed to a stage one below their own. 


METHOD
Subjects 


This experiment used 44 seventh-grade boys from 
the New Haven public schools, between the ages 
of 12-0 and 13-7. These boys, chosen at random 
from the school files, were from the middle socio-
economic class, as determined by their parents’
occupation and education level.


Scoring Methods 


An individual’s developmental stage is determined 
by using Kohlberg’s (1958) moral judgment inter- 
view, which contains nine hypothetical conflict 
stories and corresponding sets of probing questions. 
The following story is an example: 


In Europe, a woman was near death from a 
special kind of cancer. There was one drug that 
the doctors thought might save her. It was a 
form of radium that a druggist in the same town 
had recently discovered. The drug was expensive 
to make, but the druggist was charging ten times 
what the drug cost him to make. He paid $200 
for the radium and charged $2,000 for a small 
dose of the drug. The sick woman’s husband, 
Heinz, went to everyone he knew to borrow the 
money, but he could only get together about
$1,000, which is half of what it cost. He told the
druggist that his wife was dying and asked him
to sell cheaper or let him pay later. But the 
druggist said: “No, I discovered the drug and I'm 
going to make money from it.” So Heinz got
desperate and broke into the man's store to steal
the drug for his wife. Should the husband have
done that?


Two scoring procedures are available for deter- 
mining a subject's scores on each of the six stages. 
(The stage with the highest score represents his 
dominant stage.) The first, a more global method,
involves the use of rating forms devised by Kohlberg 
(1958). A second scoring procedure uses detailed 
coding forms (Kohlberg, 1958) for each of the nine 
situations of the interview. These coding forms were 
constructed and standardized on the basis of re- 
sponses given by a large number of subjects. Each 
response listed in the coding forms has a stage 
assigned to it. A subject's responses to a given situa- 
tion are divided into "thought-content" units, and 
each unit is assigned to a stage, as determined by 
the stage classification of that unit in the coding 
form. In this way the total number of units assigned 
to each stage is determined. 
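
The unit-counting step of the detailed coding procedure can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the sample protocol are hypothetical, and the conversion of unit counts to proportions is assumed here because the tables report stage scores in proportions.

```python
from collections import Counter


def stage_scores(unit_stages, n_stages=6):
    """Proportion of thought-content units assigned to each stage.

    `unit_stages` lists the stage assigned to each coded unit
    (hypothetical input format, not taken from the coding forms).
    """
    if not unit_stages:
        raise ValueError("a protocol needs at least one coded unit")
    counts = Counter(unit_stages)
    total = len(unit_stages)
    return {stage: counts.get(stage, 0) / total
            for stage in range(1, n_stages + 1)}


def dominant_stage(scores):
    """The stage with the highest score is taken as the dominant stage."""
    return max(scores, key=scores.get)


# Hypothetical protocol: eight units, five of them coded at Stage 3.
units = [3, 3, 2, 3, 4, 3, 2, 3]
scores = stage_scores(units)
print(scores[3])               # share of units coded at Stage 3
print(dominant_stage(scores))  # stage with the largest share
```

Ties between stages are not handled in this sketch; `max` simply returns the lowest-numbered stage among equal scores.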


Design and Procedure 


There were three steps in the experimental pro- 
cedure. The subject's dominant stage was determined 
by a pretest interview. In the experimental session 
subjects were exposed, through role playing, to con- 
cepts that were either one below, one above, or two 
above their initial dominant stages. These experi- 
mental treatments will be referred to as —1, +1, and 
+2 treatments, respectively. The treatment groups 
were equated on IQ via the Ammons Full-Scale 
Picture Vocabulary Test. In a posttest interview the 
subjects' stage scores were reassessed to determine
the influence of the treatment. 

Pretest selection interview. During the first meet- 
ing each subject was individually administered six 
of the nine situations of the Kohlberg interview in
order to determine his initial stage scores. A tentative 
assessment of each subject’s scores was made using 
Kohlberg’s global rating forms. Only those subjects 
whose scores on the dominant stage were twice as 
large as their scores on the next most dominant stage 
were retained. In all, 21 subjects were discarded, 
while the 48 retained were equally distributed among 
Kohlberg’s Stages 2, 3, and 4. 

Since the global rating system did not provide the 
sensitivity desired for the experiment, the protocols 
of subjects retained were rescored using Kohlberg's
detailed coding forms. Only those subjects who then
scored higher on their dominant stage, as determined
by the global ratings, than on any other stage were
retained. Four subjects were thus discarded, leaving
a total of 44.
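
The two retention criteria described above can be written as simple predicates. This is a sketch, not the procedure actually used: the function names and sample scores are hypothetical, and "twice as large" is read here as "at least twice as large", which is an assumption.

```python
def passes_global_screen(global_scores):
    """First screen: the dominant-stage score is at least twice the
    score on the next most dominant stage (global ratings)."""
    ranked = sorted(global_scores.values(), reverse=True)
    return ranked[0] >= 2 * ranked[1]


def passes_detailed_screen(detailed_scores, global_dominant):
    """Second screen: under the detailed coding, the stage that was
    dominant in the global ratings still outscores every other stage."""
    top = detailed_scores[global_dominant]
    return all(top > score
               for stage, score in detailed_scores.items()
               if stage != global_dominant)


# Hypothetical stage scores for two subjects (stage -> proportion).
clear_case = {2: 0.7, 3: 0.2, 4: 0.1}
mixed_case = {2: 0.5, 3: 0.4, 4: 0.1}
print(passes_global_screen(clear_case))       # retained by first screen
print(passes_global_screen(mixed_case))       # discarded by first screen
print(passes_detailed_screen(clear_case, 2))  # retained by second screen
```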

Experimental treatment conditions. All subjects of
a given dominant stage were randomly assigned to
the control group or to one of three experimental groups
(N = 11 per group). In the experimental treatments,
administered 2 weeks after the pretest, subjects were
exposed to moral reasoning in individual role-
playing situations with an adult experimenter. In one
treatment the reasoning presented was one stage
below the initial dominant stage (—1 treatment);
the second treatment group was exposed to reasoning 
that was one stage above (+1 treatment); and in 
a third treatment the reasoning presented was two 
stages above (+2 treatment). Members of the con- 
trol group were not seen by the experimenter for 
any kind of treatment. 

Through role playing of the three remaining stories 
of the Kohlberg interview, experimental subjects
were exposed to the new moral concepts. After each
story was read the subject played the role of the
main character in the story, and as the main char-
acter he was to seek advice about the problem from
two friends. The experimenter played the parts of
the two friends. The subject first asked one “friend” 
for “advice,” with that friend’s advice favoring one 
side of the conflict, and then asked the second 
friend, who favored the other side of the conflict. 
The reasoning was always at the stage appropriate 
to the subject’s treatment condition. All the argu-
ments used in the role playing were constructed by
closely following the coding forms and thus are
based on specific coded responses. 

Illustrative examples of the treatment-condition 
arguments are based on the Kohlberg situation in 
which the husband’s conflict is between stealing a 
drug or letting his wife die. The following two argu- 
ments, containing Stage 3 reasoning, represent what 
a Stage 2 subject in the +1 treatment was exposed 
to in this situation: 


(a) You really shouldn’t steal the drug. There 
must be some better way of getting it. You could 
get help from someone. Or else you could talk the 
druggist into letting you pay later. The druggist 
is trying to support his family; so he should get 
some profit from his business. Maybe the druggist 
should sell it for less, but still you shouldn’t just 
steal it. 


(b) You should steal the drug in this case. 
Stealing isn't good, but you can't be blamed for 
doing it. You love your wife and are trying to 
save her life. Nobody would blame you for doing
it. The person who should really be blamed is the 
druggist who was just being mean and greedy. 


The experimenter, while administering the treat- 
ment, did not know the subject's stage, since he had 
not scored the pretest, and did not know the experi- 
mental group of the subject; hence administration 
of the treatments was blind. The only exceptions 
were subjects exposed to “Stage 1” concepts, who
must have been in Stage 2, and those exposed to
“Stage 6,” who must have been Stage 4 subjects.
The possibility of the experimenter recalling the sub- 
jects’ stages since he previously interviewed them is 
unlikely because there were many lengthy interviews, 
and because a subject's stage is determined using
the scoring guides. 

Posttest interview. The posttest consisted of the 
six pretest situations plus the three situations of the 
experimental treatments; it was administered to the 
experimental subjects 1 week after the treatment, 
and to the control subjects 3 weeks after the pre- 
test. Subjects were called to the experimental room 
individually, where they were told they would be 
asked questions regarding stories similar to the ones 
they had previously heard. (The repetition of some 
of the stories and questions did not seem to affect 
the subjects’ willingness to respond. They generally
responded with the same interest and concentration 
as in the pretest.) 


Reliability and Scoring of Protocols 


The results reported in this paper are based on 
the scores obtained through the detailed coding. The 
interviews were coded by the experimenter more 
than a year after their administration. The scorer 
had no knowledge of the identity of the protocol 
he was coding, nor of its experimental condition. 
All the pretests were scored separately from the 
posttests. The coding was carried out on a situation- 
by-situation basis rather than on a subject-by- 
subject basis; after all subjects’ responses to the 
first situation were coded, all subjects’ responses to 
the next situation were coded, and so on. 

One estimate of the reliability of Kohlberg's de- 
tailed coding system is based on the independent 
coding by two judges of responses obtained from 
17 subjects not used in this experiment. Scores for 
each subject consisting of the percentage of the 
statements falling into each of the six stages were 
calculated. A weighted score per subject was then 
obtained by multiplying the number of points at 
each stage by the number of the stage, summing 
these products, and dividing the sum by the total 
number of points. The product-moment correlation 
between the scores of the two judges was .94. 
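
The weighted score and the product-moment correlation described above can be illustrated as follows. The function names and all sample values are hypothetical; only the arithmetic (multiply the points at each stage by the stage number, sum, divide by the total points) follows the description in the text.

```python
from math import sqrt


def weighted_score(points_by_stage):
    """Sum of (stage number x points at that stage) divided by total points."""
    total = sum(points_by_stage.values())
    if total == 0:
        raise ValueError("subject has no scored points")
    return sum(stage * pts for stage, pts in points_by_stage.items()) / total


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical subject: 2 points at Stage 2, 5 at Stage 3, 1 at Stage 4.
example = {2: 2, 3: 5, 4: 1}
print(weighted_score(example))  # (2*2 + 3*5 + 4*1) / 8

# Hypothetical weighted scores assigned by two judges to four subjects.
judge_a = [2.5, 3.0, 3.5, 2.0]
judge_b = [2.4, 3.1, 3.6, 2.2]
print(round(pearson_r(judge_a, judge_b), 3))
```

Collapsing six stage scores into one weighted number makes the two judges' codings directly comparable with a single correlation, at the cost of hiding disagreements that cancel out across stages.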

A measure of interjudge agreement on the scoring 
of the subjects in this experiment was obtained 
from the correlation between the scores of the 
author, who used the detailed coding system, and 
those of another scorer who used the global rating
system. Under both scoring systems a subject re-
ceives a number of points on each stage, which can
be converted into a single score by the procedure 
described above. A product-moment correlation of 
.78 was found for the original 48 subjects. Since the 
two scoring systems differ slightly, this correlation 
is a conservative estimate of the interjudge reliability 
of the detailed coding system. 


RESULTS 


The analysis of the posttest interview, 
which included all nine moral judgment 
situations, was divided into the following two 
parts: 

1. Stage scores were obtained from the
posttest responses to the three situations used 
in the treatments and not in the pretest. Since 
the experimental subjects were directly influ- 
enced on those three situations, these scores, 
which will be referred to as “direct scores,” 
represent the amount of direct influence of 
the treatment. 

2. Posttest stage scores for the six situa- 
tions used in the pretest represent the amount 
of indirect influence, or the tendency to gen-
eralize the treatment influence to situations
differing from those on which subjects were 
directly influenced. The measure reflecting in- 
direct influence is the difference between a 
subject’s pretest and posttest scores on each 
stage. These change scores will be referred to 
as “indirect scores.” 
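
The indirect (change) score defined in the second point is a per-stage difference between posttest and pretest scores on the six repeated situations. A minimal sketch, with a hypothetical function name and made-up proportions:

```python
def indirect_scores(pretest, posttest):
    """Change score per stage: posttest proportion minus pretest proportion.

    Both arguments map stage number -> proportion of usage on the six
    situations given at both sessions (hypothetical input format).
    """
    stages = sorted(set(pretest) | set(posttest))
    return {stage: posttest.get(stage, 0.0) - pretest.get(stage, 0.0)
            for stage in stages}


# Hypothetical Stage 3 subject whose usage shifts from Stage 2 toward Stage 4.
pre = {2: 0.25, 3: 0.5, 4: 0.25}
post = {2: 0.125, 3: 0.5, 4: 0.375}
change = indirect_scores(pre, post)
print(change)  # negative values mean less usage of that stage on the posttest
```

Because each subject's stage proportions sum to 1 at both sessions, the change scores across all stages sum to zero, so a gain on one stage always appears as a loss on others.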


TABLE 1

MEAN DIRECT POSTTEST STAGE SCORES (IN PROPORTIONS) ON THE STAGES ONE BELOW (—1), THE SAME AS (0), ONE ABOVE (+1), AND TWO ABOVE (+2) THE PRETEST DOMINANT STAGE

Stage relative                  Condition groups
to pretest         —1           +1           +2
dominant stage     treatment    treatment    treatment    Control
—1                 .336 (11)    .183 (12)    .209 (13)    … (14)
0                  …            …            …            …
+1                 … (21)       .266 (22)    .145 (23)    .122 (24)
+2                 .057         .102         .099         .085

Note.—Dunnett t tests significant at the .05 level, …; at the … level, 11 > 12; at the .005 level, 22 > 21, …
Direct Scores


The analysis of the direct scores involved 
the percentage of usage for each subject of 
the stage that is: one below the initial domi- 
nant stage (—1 scores), at the same stage 
as the initial dominant stage (0 scores), one 
above the initial dominant stage (+1 scores), 
and two above the initial dominant stage 
(+2 scores).3

The hypothesis was that an individual ac- 
cepts concepts one stage above his own domi- 
nant position more readily than he accepts 
those two stages above, or those one stage 
below. Two specific hypotheses result from 
this general hypothesis that the +1 treatment 
would be the most effective: (a) that the +1 
treatment causes more movement to +1 than 


the +2 treatment causes movement to +2. 


or the —1 treatment to —1, and (5) that the 
+1 treatment causes more +1 movement 
than does any other treatment. 

Test of Hypothesis a. Table 1 presents 
(in boldface type), for each experimental 
group, the mean amount of usage of concepts 
at the same stage as that of the treatment 
condition. Table 1 also presents the control
group mean scores on the stages that are one
below (—1 scores), one above (+1 scores),
and two above (+2 scores) their dominant
stage. 

The experimental groups’ scores may not 
reflect solely the influence of the experimental 
manipulations. To determine how much of 
these scores reflects factors other than the 
treatments, it is necessary to correct for the 
change that would have occurred independ-
ently of the experimental manipulations. The
best estimate of this change is provided by
the control group, which had no treatment.
It may be assumed that the scores of the
control group are due to statistical regression
and other artifactual sources.4


The experimental groups’ scores were cor-
rected by subtracting from those scores the
corresponding control group scores. This sub-
traction was done in the following manner:
The —1 mean of the control group was sub-
tracted from the —1 mean of the —1 treat-
ment group; the +1 mean of the control
group was subtracted from the +1 mean of
the +1 treatment group; the +2 mean of
the control group was subtracted from the
+2 mean of the +2 treatment group. The
three corrected means (—1 = .096, +1 =
.144, +2 = .014) obtained in this way are
presumably free of artifacts and thus repre-
sent the amount of influence of the experi-
mental treatments.

The corrected means show that, as hy-
pothesized, the direct influence of the +1
treatment was greater than that of the other
two treatments. The corrected mean of the
+1 treatment group was shown to be signifi-
cantly greater than the corrected mean of the
+2 treatment group by a one-tailed t test
(t = 3.55, p < .005).5 The one-tailed t test
of the difference between the corrected means
of the +1 treatment group and the —1 treat-
ment group reached a borderline level of
significance (t = 1.43, p < .10).


3 The other scores, such as those of the stage
two below or three above the dominant stage, are
not reported because they did not show significant
differences between the groups and do not add to
the understanding of the problem.

4 It may be a function of skewness that the —1
score of the control group was considerably larger
than the +1 or +2 scores. Of a subject's series of
scores one stage has the largest score while its
adjacent stages have the next largest scores, with
the stages more distant from the dominant stage having
smaller scores. The subjects of this experiment tended
to use the stages below the dominant stage more
than those above, resulting in a positively skewed
distribution on the six situations of the pretest.
When the other three situations are included, more
usage of the stage directly below the dominant stage,
resulting in less skewness, would be expected.

The control group and the experimental groups
were originally very similar. There were no sig-
nificant differences between the combined scores of
the experimental groups and the scores of the control
group, with the t values ranging from .10 to .65.
We also compared each experimental group with the
control group and found no significant differences.

5 Having subtracted the appropriate control score
from the experimental condition score we then com-
puted a t test for the difference between the corrected
means. The standard error for this t test is complicated
by the fact that we subtracted correlated groups from
independent groups. However, the appropriate
standard error may be shown to be:

$$\sqrt{s_1^2 + s_2^2 - r_{c_1 c_2}\, s_1 s_2}$$

where:

$s_1^2$ = the $MS_e$ for the +1 scores multiplied by 2/n
$s_2^2$ = the $MS_e$ for the +2 scores multiplied by 2/n
n = the number of subjects in each group
$r_{c_1 c_2}$ = the correlation between the +1 and +2 scores
of the control group.
(We are indebted to Robert Abelson and Merrill
Carlsmith for the derivation of this expression.)
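
The control-group correction and the standard error given in Footnote 5 can be sketched numerically. The function names, the error mean squares, and the control-group correlation below are hypothetical placeholders chosen only to make the example run; the two corrected means (.144 and .014) are the values reported in the text, and no attempt is made to reproduce the reported t values.

```python
from math import sqrt


def corrected_mean(treatment_mean, control_mean):
    """Treatment-group mean minus the corresponding control-group mean."""
    return treatment_mean - control_mean


def se_corrected_difference(ms_error_1, ms_error_2, n, r_control):
    """Standard error of the difference between two corrected means.

    Follows the Footnote 5 expression sqrt(s1^2 + s2^2 - r * s1 * s2),
    where each s^2 is an error mean square multiplied by 2/n and r is
    the correlation between the two control-group scores.
    """
    s1_sq = ms_error_1 * 2 / n
    s2_sq = ms_error_2 * 2 / n
    return sqrt(s1_sq + s2_sq - r_control * sqrt(s1_sq) * sqrt(s2_sq))


# Corrected +1 and +2 means as reported in the text.
difference = 0.144 - 0.014

# Hypothetical inputs: these three values are NOT taken from the paper.
se = se_corrected_difference(ms_error_1=0.0055, ms_error_2=0.0055,
                             n=11, r_control=0.3)
print(round(difference / se, 2))  # illustrative t ratio only
```

The subtracted term reflects that both corrections draw on the same control subjects: the more strongly the control group's +1 and +2 scores are correlated, the smaller the standard error of the difference becomes.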

The corrected mean of the —1 treatment 
group was significantly greater than the cor- 
rected mean of the +2 treatment group 
(t = 2.03, p < .05). 

Test of Hypothesis b. We have demon- 
strated that the amount of usage of the treat- 
ment condition stage was greater in the +1 
treatment group than in the other two experi- 
mental groups. While this result is necessary 
to demonstrate the greater influence of the 
+1 treatment, the +1 scores of the +1 treat- 
ment group must also be compared with the 
+1 scores of the other groups. 

Table 1 contains the +1 scores of each of 
the four groups. The differences between the 
+1 score of the +1 treatment group and the 
+1 scores of the other groups were tested 
using Dunnett’s t statistic, which is appropri-
ate in simultaneously testing one group mean
against each of several others (Winer, 1962).
These t tests indicated that the +1 treatment
was the most effective condition in moving 
subjects up one stage, since the +1 score of 
the +1 treatment group was significantly 
larger than the +1 scores of any other group 
(Table 1). 

Other findings. Table 1 also presents the 
—1, 0, and +2 scores for the four groups. 
The —1 score of the —1 treatment group was 
larger than the —1 scores of the other groups. 
However, the Dunnett t test indicates that
the difference between the —1 score of the 
—1 treatment group and the —1 score of the 
control group did not reach significance 
(t = 1.66). The differences between the —1 
score of the —1 treatment group and the 
—1 scores of the +1 and the +2 treatment 
groups were both significant (Table 1). 

Using Dunnett t tests, comparisons of the
+2 score of the +2 treatment group with the
+2 scores of the control group (t < 1), of
the —1 treatment group (t = 1.16), and the
+1 treatment group (t < 1), indicated that
the +2 treatment did not show a significant 
effect. 

Congruent with the hypothesis, the con- 
trol group and the +2 treatment group 
showed the greatest usage of the dominant 
stage (0 scores). An analysis of variance com-
paring the control and the +2 treatment
groups on the one hand, with the —1 and 
+1 treatment groups on the other hand, 
showed a significant difference (F = 4.72, 
df = 1/32, p < .05). 

Conclusions regarding the direct scores. (a) 
The +1 treatment had a direct effect, an 
effect greater than that of either the —1 or 
+2 treatment. (b) Although not reaching 
an acceptable significance level, there was 
some suggestion that the —1 treatment had 
an effect in moving subjects down one stage. 
(c) The +2 treatment did not show a sig-
nificantly greater effect than the control
condition or the other experimental treat- 
ments in moving subjects up two stages. 


Indirect Scores 


The analysis of the indirect scores was 
similar to that of the direct scores. The in- 
direct score is not a rating of responses on 
the posttest, but rather a measure of change 
from pretest to posttest. For each subject's 
stage scores we subtracted the pretest from
the posttest scores and obtained change
scores.

As indicated by Table 2, the pattern of 
results of the indirect scores was consistent 
with the hypotheses and with the results on 
the direct scores. The evidence is only sug-
gestive since significant findings were mini-
mal. A one-tailed t test (t = 2.70, p < .025)
showed that the corrected mean of the +1 
treatment group (.052) was significantly 
larger than that of the +2 treatment group 


TABLE 2

MEAN INDIRECT POSTTEST STAGE SCORES (IN PROPORTIONS) ON
THE STAGES ONE BELOW (—1), THE SAME AS (0), ONE ABOVE
(+1), AND TWO ABOVE (+2) THE PRETEST DOMINANT STAGE

Stage relative to |            Condition groups a
pretest dominant  | —1        | +1        | +2        |
stage             | treatment | treatment | treatment | Control
—1                | +.057     | +.001     | +.009     | +.045
0                 | —.045     | —.043     | —.022     | —.061
+1                | —.004     | +.045     | +.016     | —.007
+2                | —.016     | —.002     | +.008     | +.010

a N = 11 in each group.
(—.002). The one-tailed t test of the differ-
ence between the corrected means of the +1
treatment group (.052) and the —1 treatment 
group (.012) reached a borderline level of 
significance (t = 1.46, p < .10). Although
the +1 score of the +1 treatment group was 
larger than the +1 scores of the other groups, 
none of these differences was significant. No 
other relevant differences were significant. 
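
The corrected means quoted above can be checked against Table 2. The sketch below assumes that a "corrected mean" is the treatment-group mean minus the control-group mean on the same stage. That definition is not restated in this section, but it reproduces the three values given in the text. The dictionary layout and the function name are illustrative only.

```python
# Mean indirect scores (proportions) as read from Table 2.
# Outer key: stage relative to the pretest dominant stage.
# Inner key: condition group.
TABLE_2 = {
    -1: {"-1": 0.057, "+1": 0.001, "+2": 0.009, "control": 0.045},
    0: {"-1": -0.045, "+1": -0.043, "+2": -0.022, "control": -0.061},
    1: {"-1": -0.004, "+1": 0.045, "+2": 0.016, "control": -0.007},
    2: {"-1": -0.016, "+1": -0.002, "+2": 0.008, "control": 0.010},
}


def corrected_mean(stage, group):
    """Treatment mean minus control mean on one stage (assumed definition)."""
    row = TABLE_2[stage]
    return round(row[group] - row["control"], 3)


# Each treatment group is evaluated on the stage it was exposed to.
for stage, group in [(1, "+1"), (2, "+2"), (-1, "-1")]:
    print(group, "treatment:", corrected_mean(stage, group))
```

Under this assumption the +1, +2, and —1 treatment groups yield .052, —.002, and .012, which are the figures reported in the text.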


DISCUSSION 


The analysis of the direct scores showed 
that the +1 treatment was the most effective 
of the three treatments, with the +2 treat- 
ment being the least effective. The similarity 
between the patterns of the indirect and the 
direct scores suggests that the differential 
effect of the treatments represented something 
more than memorization of the specific ver- 
balizations used in the treatments and that 
some change occurred in generalized moral 
concepts. This conclusion remains tentative 
since the results on the indirect scores were 
minimally significant and since the same 
interview form was used in the test-retest 
procedure. 

The findings support Kohlberg's schema of 
stages as representing a developmental con- 
tinuum, in which each individual passes 
through the stages in the prescribed sequence. 
If the stages do form a developmental 
sequence, then it should be easier for subjects 
to understand and utilize concepts that are 
directly above their dominant stage than 
concepts that are two stages above. 

The developmental interpretation is also
strengthened by the finding that subjects
assimilated the next higher stage more readily
than the lower stage, even though they could 
understand the concepts of the lower stage 
as well as, if not better than, those of the 
higher stage. Hence, we have an indication 
that the attainment of a stage of thought 
involves a reorganization of the preceding 
modes of thought, with an integration of each 
previous stage with, rather than an addition 
to, new elements of the later stages. 


Causal Factors of Changes in Stage 


The subjects exposed to the stage one 
above their dominant stage did learn to use 
some new modes of thought. A factor causing
the use of new modes of thought may be
cognitive conflict. Indeed, Smedslund’s work 
with the concept of conservation (Smedslund, 
1961a, 1961c, 1961d) indicates that cognitive 
conflict may lead to reorganization of struc- 
ture. The concept of cognitive conflict is 
similar to the concept of disequilibrium, 
which Piaget and Inhelder have presented 
rather obscurely (Inhelder & Piaget, 1958;
Piaget, 1950). They seem to be saying that
movement from one structure to the next 
occurs when the system, by being challenged, 
is put into a state of disequilibrium. Thus 
change in structure would involve the estab- 
lishment of a new equilibrium after the 
occurrence of disequilibrium. 

Such a viewpoint is relevant to our study. 
Since subjects were exposed to new modes of 
thought through arguments justifying both 
sides of a moral conflict, they did not really 
receive solutions to the problems. Such a 
situation, which exposed subjects to cogent 
reasons justifying two contradictory positions,
could have resulted in cognitive conflict
arising from an active concern with both sides 
of the issue. When the arguments were too 
“simple,” as in the —1 treatment, the sub- 
jects may not have become actively involved. 
When the arguments were too “complicated,” 
as in the +2 treatment, the subjects may 
not have understood them. However, exposure 
to concepts one stage above, concepts within 
a subject’s grasp, allowed him contact with 
new contradictory ideas requiring thought. 
Perhaps coping with concepts that had some
meaning to the subjects led to new modes
of thinking, or to a greater use of the stage 
that was one above the initial stage. 


Related Studies 


A study having a direct relation to the 
present research, by Bandura and McDonald 
(1963), attempts to demonstrate that Piaget's 
(1948) sequence of moral development 
changes is a function of reinforcement con- 
tingencies and imitative learning. The study 
assumed that Piaget’s stages of moral de- 
velopment could be defined as a stage of 
“objective responsibility” (moral judgment in 
terms of the material damage or conse-
quences), followed by a stage of “subjective
responsibility” (judgment in terms of inten- 
tion). Following one of Piaget’s procedures, 
Bandura and McDonald assigned children to 
stages in terms of responses to paired storied 
acts, one being a well-intentioned act result- 
ing in considerable material damage, and the 
other a maliciously motivated act resulting in 
very little material damage. 

Their experimental treatments attempted 
to influence the subjects by reinforcing adult 
models who expressed judgments in opposition 
to the child’s orientation, and by reinforcing 
any of the child’s own responses that ran
counter to his dominant mode. Two measures 
of learning of the opposite orientation were 
obtained: the amount of learning during ex- 
perimental treatment, and a posttest response 
to new stories immediately following the 
treatment. They showed that children could 
be influenced to judge on the orientation op- 
posite to their initial one. Bandura and Mc- 
Donald viewed this evidence as “throwing 
considerable doubt on the validity of a devel-
opmental stage theory of morality.”

An adequate test of a stage theory of 
morality must deal with stages that are truly 
representative of mental structure rather than 
with specific verbal responses. Empirical tests 
of Piaget’s moral judgment theory indicate 
that the stages do not meet the necessary 
criteria (Kohlberg, 1963b). However, the 
Bandura and McDonald study does not pro- 
vide an adequate test of Piaget's theory be- 
cause his two stages are not those of objective 
and subjective responsibility, but rather are 
those of heteronomous and autonomous orien-
tations. The heteronomous and autonomous
stages are each represented by 11 observable 
aspects (Kohlberg, 1963b) of children’s defi- 
nitions of right and wrong, of which the 
dimension of objective-subjective responsi- 
bility is only one. By studying only one 
dimension as manifested in children’s choices 
between two alternatives Bandura and Mc- 
Donald dealt with isolated surface responses, 
and not with the concept of stage or mental 
structure. In their experimental treatment 
one of two possible answers was reinforced.
Therefore, the induced changes did not repre-
sent underlying structures, but instead repre-
sented switches to what the subjects thought
were the correct answers.⁶

Another important deficiency in their pro- 
cedure was the administration of the posttest 
immediately after the experimental treatment. 
As Smedslund (1961b) has demonstrated, the 
test of duration over time is a main criterion 
for distinguishing between cognitive structure 
and superficially learned responses. There was 
a small decrease in subjective responses given 
by objective children from the experimental 
treatment to the posttest, while there was no 
such decrease in the objective responses of 
subjective children. This finding, that down-
ward movement was more stable than upward
movement, is in contrast to our findings, in 
which upward movement was more stable. It 
is not surprising that the learning of surface 
verbal responses related to a lower stage can 
be retained for a short time. It is interesting 
that the learning of responses related to a 
higher stage was not entirely retained, even 
for such a short period of time. 

In the present research we have worked 
with responses assumed to reflect mental
structure and have found that the concept 
of developmental stage or mental structure 
has much relevance to the understanding of 
children's moral thinking. We suggest that 
the effectiveness of environmental influences 
depends on the relation between the type of 
concept encountered and developmental level. 


6 It must be pointed out that in the contrasting 
pairs of stories a well-intentioned act always resulted 
in much material damage, while the maliciously 
motivated act always resulted in little material 
damage. A child in the objective stage could easily
have learned to more frequently designate, as being 
worse, that story which contained less material 
damage, thinking that it was the expected answer. 
Thus he could have given the "higher stage" answer 
without having learned the concept of intention. 
SUBLIMINAL PERCEPTION OR PERCEPTION OF PARTIAL
CUE WITH PICTORIAL STIMULI¹


GERALD GUTHRIE AND MORTON WIENER
Clark University 


3 experiments were undertaken to investigate the ostensible subliminal effects 
found with “below threshold” exposures of pictorial stimuli. Following the as-
sumptions of the “part-cue response-characteristic” model, it was hypothesized
that: (a) pictorial stimuli used in earlier studies differed in amount of struc- 
tural attributes available to Ss; (b) such structural cues are responded to 
differentially by Ss; and (c) with the thematic content held constant, sys-
tematic difference in perceptual behavior is a function of variations in struc-
tural attributes (e.g., angularity) by the subliminal stimuli. The results con-
firmed these hypotheses, and the part-cue response-characteristic model re-
mains a tenable explanation of the perceptual behavior ascribed to subliminal
perception.


Three kinds of explanations are typically 
offered to account for intra- and interindi- 
vidual differences in responses to stimuli 
ostensibly presented below threshold. One 
explanation (e.g., Lazarus & McCleary, 1951) 
invokes a subliminal process which can dis-
criminate or “know” the threatening or need-
related stimulus and inhibits or facilitates
recognition even though the individual is un- 
aware of the stimulus. A second view (e.g., 
Goldiamond, 1958; Howes, 1954) posits that 
the differences in recognition threshold found 
for various classes of words can be attributed 
solely to differences in response parameters, 
such as the probability or frequency of usage 
of words, and that stimulus effects are not at 
issue. The third (e.g., Eriksen & Browne,
1956; Kempler & Wiener, 1963, 1964;
Wiener²), similar to the response probability
view, holds that a subliminal process is un- 
necessary to account for the findings, but, in 
contrast to the response probability view, 
asserts that the so-called subliminal stimuli, 
in fact, provide part-cues to which subjects 
respond in a predictable manner. In this 


1 This study was supported in part by Grant
M-3860 from the National Institute of Mental 
Health, United States Public Health Service. Ac- 
knowledgments and thanks go to Morris Eagle, who 
made his original stimuli and trait list available, 
and to the Journal of Personality for permission to 
reproduce Eagle’s stimuli. 

2 “Subception or Perception of Partial Cue,”
progress report and research proposal to the National
Institute of Mental Health, United States Public
Health Service, 1962.


latter view, both stimulus input and the 
response characteristics of the subjects are 
used to account for the findings in studies 
investigating “subliminal effects.”³

Most of the controversy about these 
alternative explanatory models has centered 
around findings derived from studies in which 
words were the stimuli. However, in several 
studies, pictorial stimuli have been used to 
investigate subliminal effects (e.g., Eagle,
1959; Klein, Spence, Holt, & Gourevitch, 
1958). In addition, in these latter studies, the 
effects of the “subliminal” stimuli were inves- 
tigated by assessing the subjects’ affective 
responses on an adjective check list, rather 
than by recognition or identification responses. 

In Eagle’s (1959) study⁴ three different
stimulus pictures were used: a man with a 
knife attacking another man (aggressive 
picture); the same man handing a cake to 
another, seated man (benevolent picture);
and the same man (attacker and giver) 


3 A fuller discussion of the underlying issues in
these three views is available in Wiener (see Foot- 
note 2) and Kempler and Wiener (1963). 

4 Eagle’s study and stimuli were used as the
exemplar in these investigations because his findings 
could be interpreted as being consistent with a sub- 
ception explanation. In contrast, in other studies 
using pictorial stimuli (e.g., Klein et al., 1958), it is
unclear whether “subliminal” effects could be in- 
ferred. A close reading of these latter studies indi- 
cates that the results seem to be a function of 
response consistency by subjects across stimuli, 
rather than a function of kinds of stimuli presented 
subliminally to the subjects. 
Fig. 1. Eagle’s stimuli. (Top, neutral; middle,
benevolent; bottom, aggressive.) 


standing with his hands to his side (neutral 
picture; see Figure 1). Subliminal exposure 
was established by use of a masking tech- 
nique. In this technique, when one picture 
(the A stimulus) is exposed briefly—at a 
duration normally sufficient to see and 
identify—and followed by a second picture 
(B stimulus), then the A is “not seen,” 
yet it appears to have some effect upon the 
perception of the B stimulus (Werner, 1935). 
The aggressive or benevolent picture was pre- 
sented as the A stimulus and the neutral 
picture as the B stimulus. Eagle found that 
subjects’ responses on a personality trait list 
about the man in the neutral picture varied 
systematically with the different A stimulus 
pictures. There were more negative person-
ality traits (i.e., “harmful,” “unpleasant,”
etc.) attributed to the man in the neutral 
picture when the A stimulus was the aggres- 
sive picture than when it was the benevolent 
picture. Eagle interpreted the findings as 
being consistent with the subliminal process 
view. 

Invoking the “part-cue response-character-
istic” view to account for Eagle’s data re-
quires two assumptions: some cues are avail-
able, and the subjects respond to these avail-
able cues in a specifiable way.⁵ In a recent
experiment, Schiller and Wiener (1963) 
found that the amount of information avail- 
able from an A stimulus in this masking pro- 
cedure is a function of its particular con- 
figuration and exposure time, as well as that 
of the B stimulus. Insofar as the stimuli in 
studies investigating subliminal effects (par- 
ticularly Eagle’s) were very complex, and 
each study employed a range of A exposure 
durations, it seemed highly likely that some 
cues were available to the subjects, even 
though these cues might be insufficient for 
the subjects to identify or be aware of the 
“meaning” of the A stimulus. Examination of 
the stimuli used by Eagle suggested that 


5 While the response probability view can account 
for the recognition threshold data for words, it is 
not clear how this view can handle findings such
as Eagle's. Without stimulus input of some kind, it 
is difficult to understand the variations between 
subjects in responses for the different A stimuli. Two 
groups of subjects having the same response biases 
would not be expected to give different responses
under ostensible conditions of no stimulus input. 


certain structural attributes (thickness of line 
and amount of angularity of line) were dif- 
ferentially present in his two A stimuli. The 
fact that these same kinds of attributes are 
used to represent different personality states 
or characteristics in cartoons suggested that 
if these structural attributes were, in fact, 
available, the subjects would respond to them 
in predictable ways. 

In sum, the part-cue response-characteristic 
explanation posits that certain line cues are 
available to the subjects in studies such as 
Eagle's, and that subjects respond differen- 
tially to these cues (i.e., positively or nega- 
tively). It was hypothesized that: 

1. Subjects respond predictably to varia- 
tions in structural attributes of lines. 

2. The A stimuli (in this instance the spe- 
cific drawings employed by Eagle) differ in 
the amounts of these structural attributes 
available to the subjects. 

3. Under conditions with content (“mean-
ing”) controlled, variations in structural at-
tributes of line are accompanied by system-
atic variations in response under so-called 
“subliminal” exposure. 

These hypotheses were investigated in a
series of experiments. 


EXPERIMENT I 


This experiment investigated whether sub- 
jects responded predictably to variations in 
structural attributes (angularity-curvedness 
and thickness-thinness) of line. The hypothe- 
sis of Experiment I was: Lines with different 
attributes (angular-curved, thick-thin) are re- 
sponded to differentially in a negative-positive 
dimension with angular lines, negative; 
curved lines, positive; thick lines, negative; 
and thin lines, positive. 


Method 


Subjects. A total of 24 subjects were used—9 males 
and 15 females. The subjects were all members of 
an undergraduate introductory psychology class. 

Material. The stimuli consisted of four different
line drawings, each drawn with black India ink on
a 5 × 8 inch white card. The drawings, shown in
Figure 2, were: a light, angular line (LA); a heavy, 
angular line (HA); a light, curved line (LC); and 
a heavy, curved line (HC). The lines were equated 
for length and general configuration—that is, each 
line consisted of the same spacing, number, and
height of irregularities. The light lines (or thin 
Fig. 2. Line drawings used in Experiment I.
(From top to bottom the lines represent light 
angular, heavy angular, light curved, and heavy 
curved.) 


lines) were drawn with an Esterbrook pen point 
No. 2556 and the heavy (thick) lines with a No. 05 
Speedball pen. 

Trait list. The trait list for judging the drawings 
was taken from the list compiled by Eagle (1959). 
Only 59 trait dimensions (a trait and its opposite, 
i.e., “pleasant-unpleasant,” “cruel-kind”) were used
in the actual scoring. Selection of items was based 
upon the independent assessment of 10 judges, all 
graduate students in psychology. The judges were 
given Eagle’s original trait list and asked to indicate 
which member of each trait dimension was negative. 
Only those traits for which there was agreement 
among 8 of the 10 judges were utilized in the later 
scoring. Each trait dimension was ordered on a
6-point scale: “very,” “somewhat,” and “slightly”
for one trait, and “very,” “somewhat,” and “slightly” 
for its opposite. Subjects were instructed to choose 
1 of these 6 possible judgments for each trait. 
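
The judge-agreement rule described above can be sketched as follows. This is only an illustration: the ballots are hypothetical and are not data from the study, and the helper name `agreed_negative_pole` is an assumption introduced here.

```python
from collections import Counter


def agreed_negative_pole(votes, min_agree=8):
    """Return the pole most judges marked as negative, or None if fewer
    than `min_agree` judges agreed on it."""
    pole, count = Counter(votes).most_common(1)[0]
    return pole if count >= min_agree else None


# Hypothetical ballots from 10 judges for three trait dimensions.
ballots = {
    ("pleasant", "unpleasant"): ["unpleasant"] * 10,
    ("cruel", "kind"): ["cruel"] * 9 + ["kind"],
    ("long", "short"): ["long"] * 5 + ["short"] * 5,  # no agreement
}

# Keep only the dimensions on which at least 8 of the 10 judges agreed.
retained = {}
for dimension, votes in ballots.items():
    negative = agreed_negative_pole(votes)
    if negative is not None:
        retained[dimension] = negative

print(retained)
```

In this made-up case the third dimension is dropped because the judges split evenly, which mirrors how the trait list was reduced to the 59 dimensions used in scoring.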

Procedure. The subjects were seated in a semi- 
darkened room in front of a tachistoscope viewer. 
The tachistoscope, built by Scientific Prototype 
Manufacturing Company, has three channels and 
is electronically triggered. The subjects were given 
the following instructions: 


You are going to be shown a few line drawings. 
Your task will be to make judgments about the 
drawings using a list of trait-dimensions which I 
will give you. Each list has a number of different 
trait-dimensions. For example, long-short, low- 
high, and slow-fast are trait-dimensions. There will 
be six possible judgments you can make with each 
trait-dimension. For example, with the long-short
dimension you may judge the line-drawing to be 
either “very long,” “somewhat long,” “slightly 
long,” “very short,” “somewhat short,” or “slightly 
short.” You pick one, and only one, judgment for 
each trait-dimension on the list and mark the 
appropriate box. Here is a sample sheet showing 
what I mean. [At this time the examiner showed 
a two-thirds completed list using the trait- 
dimensions mentioned. The procedure was again 
explained, and the subject was asked to point to 
the different possible judgments.] On the table in 
front of you is a button which allows you to con- 
trol the presentation of the drawings in the 
viewer. Your procedure will be to look into the 
viewer and, when you are ready, press the button 
and look at the drawing, and mark. Work in this 
manner until you have completed all three pages 
of the list and all trait-dimensions have one judg- 
ment indicated. When you are finished with the 
list, let me know. Are there any questions? Here 
is the first list. Begin when you are ready. 


Each press of the subject's control button pre-
sented the drawing for 3 seconds. Every subject
judged all four line drawings. Each of the 24 sub- 
jects received a different one of the 24 possible 
orders of the line drawings. 
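
The counterbalancing can be verified with a short enumeration: four drawings give 4! = 24 presentation orders, one per subject. How particular orders were paired with particular subjects is not reported, so the sequential pairing below is an assumption.

```python
from itertools import permutations

# Abbreviations follow the text: light/heavy (L/H), angular/curved (A/C).
DRAWINGS = ("LA", "HA", "LC", "HC")

# Every possible presentation order of the four drawings.
orders = list(permutations(DRAWINGS))

# Assumed pairing: subject k receives the k-th order in the enumeration.
assignment = dict(enumerate(orders, start=1))

print(len(orders))
print(assignment[1])
```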


Results and Discussion 


The data from the trait lists were analyzed 
by scoring the total number of negative judg- 
ments made by each subject to each of the 
stimulus line drawings (LA, LC, HA, HC). 
Each subject had four such scores (one each 
for the LA line, the LC line, the HA line, 
and the HC line). An analysis of variance 
was computed on these data using a Treat- 
ment X Treatment X Subject design (Lind- 
quist, 1953, p. 237). The results of this 
analysis indicated a difference as predicted 
between angularity and curvedness of line. 
This difference was significant at less than 
the .01 level (F = 40.32, df = 1/23). The
heaviness-lightness factor was not signifi-
cant at the accepted .05 level (F = 4.26, df
= 1/23, p < .06). The interaction of the two
factors LH and CA was not significant
(F = .61, df = 1/23). 


6 A second method of analyzing the data was by
weighting the judgments of “very,” “somewhat,” and
“slightly.” In this weighting, responses to positive
traits were multiplied by 3, 2, and 1 (with 3 used 
for very, 2 for somewhat, and 1 for slightly; —3, 
—2, and —1 were assigned to the negative traits). 
These scores for each subject were combined and a 
constant added to remove negative scores. In 
Experiments I and III, where this weighting score 


The results of Experiment I support the 
hypothesis. Subjects respond systematically 
to different structural attributes of line. The 
findings indicate the subjects assign negative 
traits to angular lines and positive traits to 
curved lines. The next step was to determine
whether there are systematic differences in 
the availability of angular cues in Eagle’s 
stimuli. 
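
The two scoring schemes used for the trait lists, the count of negative judgments and the weighted alternative described in the footnote, can be sketched as below. The sample judgments are hypothetical. The source does not give the value of the constant added to remove negative scores, so the value used here (3 times the number of dimensions) is an assumption chosen only because it guarantees a non-negative result.

```python
# Weights for the three degrees on each side of a trait dimension.
WEIGHTS = {"very": 3, "somewhat": 2, "slightly": 1}


def negative_count(judgments):
    """Number of dimensions on which the negative trait was chosen."""
    return sum(1 for pole, _ in judgments if pole == "negative")


def weighted_score(judgments, constant=0):
    """Sum of +3/+2/+1 for positive and -3/-2/-1 for negative judgments,
    plus a constant to remove negative totals."""
    total = 0
    for pole, degree in judgments:
        weight = WEIGHTS[degree]
        total += weight if pole == "positive" else -weight
    return total + constant


# Hypothetical judgments by one subject on four trait dimensions.
sample = [
    ("negative", "very"),
    ("positive", "slightly"),
    ("negative", "somewhat"),
    ("positive", "very"),
]

print(negative_count(sample))
print(weighted_score(sample, constant=3 * len(sample)))
```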


EXPERIMENT II 


Eagle, in his discussion of his study, ob- 
served that when his stimuli were presented 
in an ascending series starting below thresh- 
old, subjects first reported the presence of 
lines which are jagged, angular, etc., for the 
aggressive stimulus. This observation is con- 
sistent with the assumptions of the part-cue 
response-characteristic model, but required 
systematic investigation. It was hypothesized 
that when Eagle's stimuli—the aggressive and 
benevolent scenes—are presented with a
method of ascending limits, the initial in-
formation available to the subjects from the
aggressive stimulus is different from that
available from the benevolent stimulus on 
the angular-curved dimension. 


Method 


Subjects. A total of 12 subjects from an intro- 
ductory psychology class were used—5 females and 
7 males. 

Material. Stimuli: The stimuli were 5 × 7 inch
photographic copies of the original stimuli used by 
Eagle. The prints were processed professionally and 
were equated with the originals for density and value 
as far as possible. 

Judging sheet: Each judging sheet used by the 
subjects consisted of two parts. The first part was 
a rectangle labeled “Location.” The subject was 
instructed to reproduce what he had seen by drawing 
it in the rectangle, locating it as it was on the view- 
ing screen. The second part of the judging sheet, 
labeled “Description,” had a list of seven trait di- 
mensions (“long-short,” "curved-angular," “sharp- 
dull,” “line-mass,” “soft-hard,” “jagged-rounded,” 
“heavy-light”). Each of the seven trait dimensions 
was ordered on a 5-point scale: “very” and “slight” 
for a trait and “very” and “slight” for its opposite. 
“Doesn't apply” was the fifth point on the scale.
At the bottom of the judging sheet was the question, 
“Does it look like anything, if so, what?” 


was also used, the results were the same as those 
found with the scoring of number of negative
judgments.
Procedure. The subjects were seated in front of
the tachistoscope in a semidarkened room and given 
the following instructions: 


I have two pictures that I am going to show 
you. I am going to give you the pictures right 
now and let you look at them for a moment. 
[The experimenter hands pictures to the subject.] 
The pictures will be presented very quickly, so 
quickly, in fact, that you may not see them at
first. Because of this, you will have to be atten- 
tive and concentrate. [The experimenter takes 
pictures back from the subject.] I will start with 
one of these pictures and present it to you in the 
viewer. You report to me if you see anything. 
You will probably see a flash of light, but only 
report if you see lines, objects, etc. If you do see 
something, then you take one of these sheets, 
number it, and attempt to describe what you saw 
according to the categories. The first judgment you 
will make concerns the location of whatever you 
saw. Try to reproduce what you saw in the 
appropriate part of this square, or if you cannot 
reproduce it, then just mark its approximate 
location. If you saw more than one thing, then 
mark the position of as many as you can. In the 
second portion of this sheet try to describe what 
you saw by marking these traits. There are five 
possible judgments you can make for each trait. 
For example, with the “long-short” trait- 
dimension, you may decide that what you saw was 
either “very long,” “slightly long,” “very short,” 
“slightly short” or that this trait doesn’t apply 
to what you saw. Mark the one judgment you
decide on for that trait and go to the next. Con-
tinue in this manner until you have marked all
the traits. Finally, try to answer the question at
the bottom. OK? Any questions? 


The two copies of Eagle's stimuli were inserted 
into the stimulus card holders of two separate chan- 
nels of the tachistoscope. The third channel of the 
tachistoscope was the fixation field with a blank 
card inserted in it. The subject's fixation field was 
a homogeneously lighted field. A channel selector 
enabled the experimenter to select the stimulus pic- 
ture which was to be exposed. Each stimulus picture 
was exposed in an ascending series, starting at 10 
milliseconds. The exposure times were increased in
2-millisecond steps until the subject reported seeing 
the “something” (a line, a blob, a figure—anything). 


While the subject was marking the judging sheet for
the reported something the experimenter switched
the channels. Thus, the next stimulus picture the
subject looked at to judge was the stimulus picture 
occupying the next position for that subject’s series. 
The channel switch was accompanied by an audible 
click, and the selector knob was moved a number of 
times after each presentation of the stimulus picture 
so the subject would be unaware when an actual 
change of picture took place. 
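
The ascending exposure series described above can be sketched as a simple loop. The starting duration and step size come from the text. The observer callback, the upper limit `max_ms`, and the 27-millisecond threshold in the usage example are hypothetical and serve only to make the sketch runnable.

```python
def ascending_series(reports_something, start_ms=10, step_ms=2, max_ms=200):
    """Raise the exposure time in fixed steps until the observer reports
    seeing something. Returns that duration, or None if `max_ms` (a
    hypothetical safety cap, not part of the reported procedure) is passed."""
    duration = start_ms
    while duration <= max_ms:
        if reports_something(duration):
            return duration
        duration += step_ms
    return None


# Hypothetical observer who first sees something at 27 ms or longer.
first_report = ascending_series(lambda ms: ms >= 27)
print(first_report)
```

Because exposures rise in 2-millisecond steps from 10 milliseconds, an observer with a 27-millisecond threshold would give the first report at the 28-millisecond exposure.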

Each stimulus picture was exposed in an ascending 
series until two reports of seeing something were 
indicated by the subject. Each of the two A stimuli
was shown twice to every subject. Thus, a subject
completed a total of eight judging sheets, two for 
each separate showing of the stimulus picture. All 
of the six possible orders of the two A stimuli 
(1, aggressive; 2, benevolent) were used (1221, 1122, 
1212, 2121, 2211, 2112), two subjects in each order. 
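
The six orders listed above are exactly the distinct arrangements of two showings of each stimulus, which a short enumeration confirms. The constant names are illustrative. The counts of 12 subjects and 8 judging sheets follow from the figures given in the text.

```python
from itertools import permutations

# Stimulus codes as in the text: 1 = aggressive, 2 = benevolent.
# Each A stimulus is shown twice, so an order arranges 1, 1, 2, 2.
orders = sorted({"".join(p) for p in permutations("1122")})

SUBJECTS_PER_ORDER = 2
REPORTS_PER_SHOWING = 2

n_subjects = len(orders) * SUBJECTS_PER_ORDER
sheets_per_subject = len(orders[0]) * REPORTS_PER_SHOWING

print(orders)
print(n_subjects, sheets_per_subject)
```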


Results and Discussion 


The three trait dimensions, angular-curved, 
sharp-dull, jagged-rounded, were analyzed. 
Markings of the traits angular, sharp, and 
jagged were scored as angular judgments, and 
markings of curved, dull, and rounded were 
scored as curved judgments. With three 
critical trait dimensions on each sheet, a sub- 
ject had a possible 12 angular response score 
or a possible 12 curved response score to each 
stimulus picture. Variations from this maxi-
mum possibility occurred when the subjects
used the judgment “Doesn't apply.” A t test
indicated a significant difference in the pre-
dicted direction (t = 4.69, df = 22, p < .05)
between the number of angular judgments 
made to the aggressive and benevolent stimu- 
lus pictures. The difference between the num- 
ber of curved judgments made to the stimulus 
pictures was not significant. 
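
The scoring of the three critical trait dimensions can be sketched as follows. The four judging sheets are hypothetical and stand for one subject's sheets for one stimulus picture. The sketch also assumes that the degree of a marking ("very" or "slight") was ignored, since the text describes only counts of judgments. A marking of “Doesn't apply” is represented as None.

```python
ANGULAR = {"angular", "sharp", "jagged"}
CURVED = {"curved", "dull", "rounded"}


def score_sheets(sheets):
    """Count angular and curved judgments over the critical dimensions.
    Markings of "Doesn't apply" (None) add to neither count."""
    angular = curved = 0
    for sheet in sheets:
        for mark in sheet.values():
            if mark in ANGULAR:
                angular += 1
            elif mark in CURVED:
                curved += 1
    return angular, curved


# Hypothetical sheets: 4 sheets x 3 dimensions gives the maximum of 12.
sheets = [
    {"angular-curved": "angular", "sharp-dull": "sharp", "jagged-rounded": "jagged"},
    {"angular-curved": "angular", "sharp-dull": None, "jagged-rounded": "jagged"},
    {"angular-curved": "curved", "sharp-dull": "sharp", "jagged-rounded": None},
    {"angular-curved": "angular", "sharp-dull": "dull", "jagged-rounded": "rounded"},
]

print(score_sheets(sheets))
```

Two “Doesn't apply” markings in this example bring the total below 12, which is the kind of variation from the maximum that the text mentions.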

The results of Experiment II again support 
the hypotheses. When Eagle's stimuli (aggres- 
sive and benevolent) were exposed in an 
ascending series (starting below threshold), 
the first information reported by the subjects 
differed for the two stimulus pictures. As 
expected, the responses of the subjects indi- 
cated that the initial information available 
for the aggressive picture was more angular 
than the initial information available for the 
benevolent picture. 

These findings support the part-cue
response-characteristic explanation of Eagle's 
data; that is, the picture stimuli used by 
Eagle have different amounts of angular at- 
tributes available in a “subliminal” exposure 
series, with the aggressive stimuli having more 
angularity. In this context, the results of 
Experiment I indicate that were such cues 
available, the subjects are more likely to 
judge them negatively on the trait list. 


EXPERIMENT III 


The next step was to try to demonstrate 
that with the structural characteristics of the 
stimuli systematically varied, predictable dif-
ferences in responses would occur. It was
hypothesized that, holding the thematic con- 
tent constant, judgments of a B stimulus in 
a “masking” situation can be varied as a 
function of the structural attributes of the 
A stimulus; and where the line attributes 
are held constant but the thematic content 
of the stimuli varied, judgments of the B 
stimulus in a masking situation remain 
constant. 

This experiment consisted of three parts: 
a preliminary investigation to determine the 
characteristic responses by subjects to the 
“neutral” stimulus to be used in the succeed- 
ing experimentation; an investigation to de- 
termine if the chosen A stimuli are rated 
differentially when viewed supraliminally; 
and an investigation of the effects of struc- 
tural differences in pictorial stimuli in a 
“subliminal” perception situation, with con- 
tent held constant. 


Method: Part 1 


Subjects. This experiment was group administered
to 20 students in an introductory psychology class.
Of these 20, 1 subject’s protocol was eliminated
because of his misunderstanding of directions and
a second was eliminated because she had participated
as a subject in Experiment I. A total of 18
subjects were used in the final analysis: 9 males and
9 females.

Fig. 3. A and B stimuli used in Experiment III.
(Top is the neutral B stimulus. Next, from left to
right, is A stimulus curved without gun, and A
stimulus angular without gun. Bottom row presents
A stimulus curved with gun, and A stimulus angular
with gun.)

Material. Stimulus: The stimulus was a black
India ink drawing (Esterbrook pen point No. 2556)
on a 5 × 7 inch white card. The drawing was projected
by an opaque projector onto a white beaded
screen approximately 10 feet away from the subjects.
The projected image was approximately 36 × 40
inches. The drawing was a seated man (Figure 3)
composed entirely of lines without corners, angles,
or intersections. This same drawing served as the
neutral stimulus for Part 2 of this experiment.

Trait list: The trait list and scoring was the same 
as that used in Experiment I. 

Procedure. The lists were distributed, and the class 
was instructed as follows: 


I am going to show you a drawing of a man. 
Your task will be to make judgments about the 
man using the list of traits which I have given 
you. You notice that the list is made up of 
several trait-dimensions. For example, “pleasant-unpleasant,”
“helpful-harmful” are two trait dimensions.
For the trait-dimension “pleasant-unpleasant”
you may judge the man to be “very pleasant,”
“somewhat pleasant,” “slightly pleasant,”
“very unpleasant,” “somewhat unpleasant,”
“slightly unpleasant.” You make one, and only
one, judgment for each trait-dimension on the list. 
[The blackboard was used to demonstrate a 
sample scoring sheet at this time.] Do all the 
trait-dimensions and work until you have finished 
all three pages. OK? Any questions? 


The room was then darkened and the picture pro- 
jected. The subjects began work immediately, and 
the papers were collected as each finished. The 
stimulus was projected continuously until every 
subject finished his trait list. 


Results and Discussion 


The data from the trait lists were analyzed 
by counting negative and positive responses. 
A subject’s score was the difference between
the total number of negative and the total
number of positive responses made to the
man. The subjects showed no apparent differences
in the number of positive or negative
judgments (t = .25).

This part of the experiment was designed
to determine whether the neutral B stimulus
was, in fact, responded to as neutral on the
trait list. The finding of no significant difference
between the number of positive and
negative responses to the B figure indicated
that the B stimulus could be used to test
the effects of the A stimulus in the masking
paradigm without apparently introducing any
systematic biases in the judgment of the A
stimuli. The next step was to investigate the
subjects’ responses to the A stimuli when
they are presented supraliminally.


Method: Part 2 


Subjects. A total of 52 subjects were seen in four 
groups. Three subjects misunderstood directions, and 
their judgments could not be scored. The data from 
the remaining 49 were used in the statistical analyses. 

Material. Stimuli: To make the four A stimulus 
pictures which were used in this experiment (Figure 
3), the neutral picture was reproduced, and then 
for one set of pictures the lines were joined making 
angular connections throughout, and for the second 
set making curved connections. A gun (cutout of 
a drawing in India ink) was pasted in the hand for 
two of the A stimulus drawings. Photographs, 5 × 7
inches, were made of the two drawings, one set 
without guns, then one set with guns. The final 
four A stimulus prints were: angular without gun, 
angular with gun, curved without gun, curved with 
gun. Xerox copies of these four original stimulus 
pictures were made, and each subject received one of 
the four pictures. 

Trait list: The trait list was the same as that 
used in Experiment I. 

Procedure. The subjects were given the following 
instructions: 


I am going to give each of you a drawing of a
man. Your task will be to make judgments about
the man using the list of adjective-dimensions
which is attached to it. You will notice that the
list is made up of several adjective-dimensions.
For example, “pleasant-unpleasant,” “helpful-harmful”
are two adjective dimensions. For each
of these dimensions you have six possible judgments.
For example, in the “pleasant-unpleasant”
dimension you may judge the man to be “very 
pleasant,” “somewhat pleasant,” “slightly pleasant,” 
“very unpleasant,” “somewhat unpleasant,” or 
“slightly unpleasant.” You choose one of these 
six possible judgments, mark it and go on to 
the next. Do all the adjective-dimensions and 
finish all three pages. Is that clear? 


The trait lists were collected as each subject finished 
the task.


Results and Discussion 


The data from the trait lists were analyzed 
as in Experiment I. A subject’s score was the 
total number of negative responses made to 
the man. These scores were analyzed in a
2 × 2 factorial analysis of variance. There
was a significant (F = 25.34, df = 1/45,
p < .05) difference in the predicted direction
for the total number of negative responses
made to the gun versus no-gun stimuli when 
these are presented supraliminally. There 
was also a significant (F = 6.84, df = 1/45, 
p < .05) difference in the predicted direction 
for the total number of negative responses 
made to the angular versus curved stimuli. 
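The error degrees of freedom reported for this analysis follow from the design. With one score per subject in a fully between-subjects factorial, the error df equals the number of subjects minus the number of cells. The sketch below only illustrates that arithmetic; the function name is an assumption made here and is not taken from the study.

```python
def factorial_dfs(n_subjects, levels_per_factor):
    """Degrees of freedom for a fully between-subjects factorial design.

    Returns the df for each main effect and the error (within-cells) df,
    assuming one score per subject.
    """
    n_cells = 1
    for levels in levels_per_factor:
        n_cells *= levels
    main_effect_dfs = [levels - 1 for levels in levels_per_factor]
    error_df = n_subjects - n_cells
    return main_effect_dfs, error_df


# Part 2: 49 scorable subjects in a 2 x 2 (gun x line quality) design.
effects, error = factorial_dfs(49, [2, 2])
print(effects, error)  # [1, 1] 45
```

Each main effect therefore has 1 df against 45 error df, which matches the reported df = 1/45.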

The results clearly indicate that the A 
stimulus pictures with gun, assumed to be 
thematically negative, are responded to as 
such by subjects when shown supraliminally. 
Failure to find differential responses to the 
thematically varied A stimuli in subliminal 
presentations cannot be explained away by 
assuming that the “gun” stimuli do not evoke 
more negative responses, at least as inferred 
from supraliminal responses. The next step
was to investigate subjects’ responses when 
the A stimuli are presented subliminally. 


Method: Part 3 


Subjects. A total of 36 undergraduate volunteer 
subjects were used—18 male and 18 female. 

Material. Stimuli: An equilateral triangle (one
side = 1.75 inches) drawn with a No. 2556 Esterbrook
pen and black India ink on a white 5 × 7
inch card was the stimulus for Step 1.

The neutral drawing of the seated man, used in 
Part 1 (stimulus picture B), was the stimulus for 
Step 2. 

All five stimulus pictures were used in
Step 3: the neutral drawing B of the seated man,
used in Part 1, and the four different A stimulus
pictures used in Part 2.

Trait list: The trait list used in Step 4 was the 
same as that used in Experiment I. 

Procedure. The subject was seated in front of the 
tachistoscope viewer in a semidarkened room. The
procedure consisted of five steps: 

Step 1: The subject’s recognition threshold for 
the equilateral triangle was determined. This thresh- 
old value established the levels at which the A 
stimulus was to be exposed in the later parts of the 
experiment. The subject was instructed as follows: 


I am going to show you something in the 
viewer. At first it will be presented very quickly, 
so quickly, in fact, that you will probably not
see anything. However, I will continue to present 
it and if you concentrate and pay attention, you 
will soon be able to make out what it is. I will 
say “ready” and then show you the picture. You 
tell me if you saw anything and, if you did, what 
you saw. OK? 


The equilateral triangle was presented to the subject
in an ascending series, starting at 15 milliseconds and
increasing in 2-millisecond steps, ending with the
subject’s threshold of recognition for the triangle
(i.e., the subject's verbal report of having seen a
triangle).

Step 2: This procedure was designed to obtain a
base-line description of the neutral B stimulus. The 
subject was instructed as follows: 


Now I am going to show you a picture of a 
person. I want you to describe him to me briefly— 
what he looks like, what he is doing, thinking, and 
so on. The picture will go on for a very short 
time, so watch carefully. 


The neutral B drawing of the seated man was pre- 
sented for .45 second. This .45-second interval was 
determined as the result of pilot work which indi- 
cated that this exposure gave the subjects adequate 
time to see the picture and ostensibly induced the 
masking of the A stimulus. The subject's verbal 
description of the neutral B stimulus picture was 
recorded.⁷

Step 3: The A stimulus pictures were introduced 
and the instructions emphasized “slight and subtle 
changes” in the B stimulus. The subject was 
instructed as follows: 


I am going to show you a picture of a person 
very much like the one you just saw. But from 
time to time I will introduce some very slight 
and subtle changes into the picture. I’d like you 
to tell me if you notice them. Again, briefly de- 
scribe the person to me—what he looks like, 
what he might be doing and thinking. On oc- 
casion you may feel a change in mood about the 
picture without being able to say exactly why. 
When this happens, let me know. Don’t worry 
about being too accurate. OK? 


Each A stimulus was presented to nine different 
subjects—one A stimulus per subject. It was exposed 
for a short interval and then was followed, with 
no interval between, by an exposure of the B stimu- 
lus for .45 second. One such A-B sequence was 
considered to be one presentation. The exposure 
times for the A stimulus were determined by com- 
puting the 40, 60, 80, 100, 120, 140, 160, and 180% 
levels of the subject's triangle threshold. After every 
two presentations the stimulus cards were rattled 
in the tachistoscope to encourage the idea of changes 
being initiated. The subject’s reports of change in 
the B stimulus were recorded. 
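The exposure schedule for the A stimulus in Step 3 can be sketched as a simple percentage computation on each subject's triangle threshold. The 25-millisecond threshold used below is a hypothetical value chosen only for illustration, and the function name is likewise an assumption.

```python
# Percentage levels of the triangle threshold named in the procedure.
PERCENT_LEVELS = (40, 60, 80, 100, 120, 140, 160, 180)


def a_stimulus_exposures(triangle_threshold_ms):
    """Return the A stimulus exposure times (ms) for one subject."""
    return [triangle_threshold_ms * pct / 100 for pct in PERCENT_LEVELS]


# Hypothetical subject whose triangle threshold was 25 ms.
print(a_stimulus_exposures(25))
# [10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0]
```

Under this schedule the first three exposures fall below the subject's triangle threshold and the last four fall above it.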

Step 4: The subject was given the trait list and 
instructions were as follows: 


Now, I would like you to rate the person you
just saw, the impression you got of him as a
person. You may have seen him in different ways,
in different moods, or with different expressions,
but rate him on the basis of the dominant impression
you got of him as a person. [Instructions
for judging were given as in Part 1.] Don't spend
too much time on each trait. Do it on the basis
of your first impression.


The trait list used for this step was the same as
that employed in Experiment I.

⁷ All of the procedures in this experiment were
similar to those used by Eagle. When a variation
was introduced, it was done to establish the same
psychophysical conditions that Eagle reported. For
example, in this study the B stimulus was exposed
for .45 second while Eagle exposed his B stimulus
for .33 second. In this study .45 second created the
same degree of masking for these stimuli as Eagle
reports for .33 second for his stimuli.

Step 5: The instructions attempted to direct the 
subject to any possible information he may not have 
reported in Step 3. The instructions were as follows: 


Now, I am going to show you the same picture 
of a person over and over again. I'd like you to 
be very attentive because this time, after a while, 
you will begin to see something else in addition 
to the regular picture. Whenever you see something
else, let me know. Remember, tell me everything
you see. Any questions?


Each subject again received the A-B sequence in 
an ascending series starting at his 20% threshold 
for the triangle. Increments in exposure time for the 
A stimulus were 2 milliseconds. The presentations 
continued through two levels: the subject’s first 
report of having seen "something" (a line, a blob, 
etc.), and the subject’s positive identification of the 
A stimulus. 


Results and Discussion 


The trait lists used in Step 4 were scored 
on the basis of number of negative responses. 
The total number of negative responses made 
to the B stimulus when the A stimulus was 
angular and when the A stimulus was curved 
was analyzed using a 2 × 2 factorial analysis
of variance. There was a significant
(F = 5.30, df = 1/32, p < .05) difference in
the predicted direction for the total number 
of negative responses made to the angular 
versus curved stimuli. The difference in nega- 
tive judgments between the gun-no-gun fac- 
tor was not significant (F = 2.23, df = 1/32) 
nor was the interaction of the two factors 
(F = .05, df = 1/32). 

Table 1 shows the distribution of the mean
number of negative judgments for the four
conditions. Although the F value in the analysis
of variance of the gun-no-gun variable
was not significant, the mean for each gun
picture is larger than its corresponding no-gun
picture. These data were analyzed further
by the weighted scores to determine whether
there were any effects with a more sensitive
measure. Here the differences were even less
reliable (F = .74, df = 1/32). There seems
to be no evidence that the difference between
means for gun-no-gun can be considered a
reliable one.


TABLE 1

MEAN NUMBER OF NEGATIVE JUDGMENTS FOR THE
FOUR A STIMULI IN EXPERIMENT III

          Angular   Curved
No gun      28.0     19.1
Gun         35.3     24.5

The results of Experiment III are also
consistent with the expectations of the part-cue
response-characteristic model. Changes in
trait judgments about the B or neutral stimulus
in the masking situation were systematically
related to the structural attributes of
the A stimulus, despite identity of thematic
content. The number of negative judgments
made to the B stimulus differed depending
upon whether the A stimulus was angular or
curved. Further, trait judgments for the B
stimulus showed no significant differences
when thematic content was varied and structural
cues held constant. The presence or
absence of a gun in the A stimulus had no
systematic effect on the number of negative
judgments made about the B figure.


Each subject was also given a “threshold- 
exposure difference” score. This score was 
computed for a subject by subtracting his 
threshold (first report of having seen some- 
thing) for the A stimulus (Step 5) from his 
180% triangle threshold (Step 3). This score 
represented the difference between the subject’s
A stimulus threshold and longest
exposure time of the A stimulus. A rank-order
correlation of the difference scores and the
subject's “change” score (i.e., the number of
verbal reports indicating change in the B
stimulus during the testing session) was
computed. The subject with the largest
threshold-exposure difference received a rank
of 1, and the subject indicating the greatest
number of changes received a rank of 1. A
negative correlation (ρ = −.33, p < .05) was
found. The smaller the threshold-exposure 
difference score, the more likely the subject 
had structural cues available at the higher 
exposure level used in Step 3, and the more 
likely was the subject to report perceived 
changes in the B stimulus during testing.

The threshold-exposure difference scores of
the 18 subjects who received presentations of
the angular A stimuli (with or without gun)
were correlated (rank order) with the number
of negative responses (on the trait list) given
to the angular stimuli. The highest number of
negative responses by a subject was ranked 1,
and the highest threshold-exposure difference
was ranked 1. A negative correlation was
again found (ρ = −.46, p < .05), indicating
that the smaller the threshold-exposure difference
for a subject, the more negative
responses are given to the angular A stimulus.
The closer the exposure of the A stimulus
to the subject’s threshold for identifying the
A stimulus, the more negative responses were
given, at least when the A stimulus was
angular.
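The rank-order correlations described above can be illustrated with a short sketch. It assumes the usual rank-order (Spearman) coefficient, computed as a Pearson correlation on ranks, with the largest value ranked 1 as in the text and tied values sharing their mean rank. The five data points are invented for illustration and are not the study's data.

```python
def average_ranks(values):
    """Rank values so the largest gets rank 1; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def rank_order_correlation(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical subjects: threshold-exposure difference (ms) and change reports.
differences = [30, 22, 18, 10, 4]
changes = [1, 2, 2, 5, 6]
print(rank_order_correlation(differences, changes) < 0)  # True
```

A negative coefficient here means that subjects with smaller threshold-exposure differences tended to report more changes, which is the direction of the relationship reported in the text.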


CONCLUSIONS 


The results in all of the experiments sug- 
gest that where pictorial stimuli are exposed 
at “subliminal” levels, structural cues (lines, 
etc.) are the first information available to 
the subject. These available structural cues 
vary, both in the amount and in the kind 
of attributes (e.g., angularity). The findings 
also indicate that subjects have character- 
istic positive and negative responses to these 
line attributes, whether they occur supra- 
liminally or subliminally. Experimental data 
in these investigations offer no contradictory 
evidence to the hypotheses and assumptions 
of the part-cue response-characteristic view. 
The findings in this study were derived from 
an intensive analysis of Eagle’s stimuli and 
procedures. To the extent that these findings 
can be generalized to other investigations of 
subliminal effects where pictorial stimuli were 
used, the part-cue response-characteristic 
view remains a tenable explanation for these 
other studies as well. It appears that the 
part-cue response-characteristic explanation 
can account for the so-called subliminal ef- 
fects without having to invoke a special proc- 
ess which responds to different classes of 
stimuli without the awareness of the subjects. 
GROUP DISCUSSION AND PREDICTED ETHICAL
RISK TAKING ¹


SALOMON RETTIG 


Ohio State University 


Recent studies demonstrated that group discussion increases risk taking. These 
results were explained in terms of responsibility diffusion. The hypothesis 
tested here is that group discussion affects predictive judgments of unethical 
behavior in the same way as 2 other risk conditions: privacy and impersonality. 
160 Ss, equally distributed by sex and judgment conditions, predicted the be- 
havior of persons (S himself or hypothetical) in conflict about stealing. 16 
items, each varying the expectancy and reinforcement value of gain or censure, 
were judged. The findings support the hypothesis. However, the results support
an interpretation of censure testing rather than responsibility diffusion
during group discussion.


Mob action and the behavior of gangs
and crowds seem to suggest a qualitative
transformation of individual standards of
conduct when acting in groups. Socially undesirable
behaviors unlikely to take place in
isolation can be observed in crowds. These
observations have given rise to much speculation
and to some recent experimentation
with group-induced assertiveness in behavior.
One particularly cogent view maintains that
the group situation facilitates risk-taking behavior,
and recent experiments with group
risk taking have tended to support this view.
Wallach, Kogan, and Bem (1962, 1964) have 
consistently found that group decisions pro- 
duce shifts toward greater risk taking, 
whether the payoff is hypothetical or real. 
The same results were obtained when mone- 
tary gains were paired with an aversive 
stimulation (Bem, Wallach, & Kogan, 1965).
Lonergan and McClintock (1961), who had
hypothesized a shift toward increased conservatism,
also found a consistent increase
in risk taking in interdependent groups, although
this trend apparently did not reach
acceptable levels of statistical significance
(p < .10). In a very recent experiment Wallach
and Kogan (1965) attempted to separate
the effects of group discussion and
group consensus on risk taking. They found
that the shift toward increased risk taking
which resulted from group discussion alone
was as large as that produced by discussion
and consensus. Consensus alone produced no
effect. The authors concluded that “group
discussion provides the necessary and sufficient
condition for generating the risky shift
effect [p. 17].”

¹ This study was supported by a general research
support grant from the National Institutes of Health,
United States Public Health Service.

In their explanation of the above results 
Wallach et al. (1964) repeatedly conjecture 
that the group situation produces increased 
risk taking because it permits the diffusion 
of responsibility. Since the entire group par- 
ticipates in the decision-making process, each 
member shares his responsibility with the 
group. Such spreading of responsibility al- 
lows for the choice of higher risk levels 
and the increased probability of failure asso- 
ciated with it (p. 273). It would seem that 
the principle of “responsibility diffusion” 
would have been more plausible had group 
consensus rather than group discussion been 
the critical determinant of increased risk 
taking, since a group can be held collectively 
responsible only for a decision reached but 
not for the process of discussion preceding 
the decision. In their defense of the concept 
of responsibility diffusion in relation to group 
discussion Wallach and Kogan (1965) find 
it necessary to resort to an additional, non- 
cognitive explanation of the results by stating 
that the group discussion produces affective 
ties among its members which enable each 
to feel less blameworthy in case of failure, 
following a risky decision (p. 19). 

In evaluating the noncognitive solution to 
the problem of group-induced risk taking, 
one must take into account the fact that the 
subjects were randomly assigned to the groups 
(Wallach & Kogan, 1965, p. 5) and that the
somewhat artificial atmosphere of an experi- 
ment, coupled with the temporary formation 
of a group, leaves little possibility for the 
establishment of strong affective ties among 
the subjects. Instead, the present study 
wishes to suggest a somewhat different possi- 
bility which may account, at least in part, 
for the observed increased risk taking which 
follows group discussion: the possibility that
group discussion may produce a shift in 
orientation away from failure avoidance and 
toward the maximization of gain. This shift 
in orientation may be accompanied by a 
tendency to take greater risks. The above 
explanation has the advantage of simplicity, 
making it unnecessary to refer to the theorem 
of responsibility diffusion. The latter, while 
intuitively appealing, is a very complex ex- 
planation of risk taking since it implies that 
the taking of responsibility in a group is a 
sharable process which reduces the responsi- 
bility of each member. 

The present study is designed to relate 
predictive judgments of unethical behavior 
to different external conditions of judgment, 
predominantly that of group discussion. The 
specific hypothesis tested is that group dis- 
cussion will produce a shift in predictive 
judgments of unethical behavior which is 
similar to that produced by varying other 
sources of risk, namely, privacy and im- 
personality of judgment. That is, the oc- 
currence of unethical behavior would (be 
predicted to) be more frequent if such judg- 
ments predict the behavior of a hypothetical 
person rather than the behavior of the sub- 
ject himself, and when the judgments are 
expected to remain private rather than be- 
come public, since self-disclosure and pub- 
licity about unethical behavior carry a greater 
likelihood of censure. Conversely, impersonality
and privacy about such predictions
represent judgment conditions of low risk.
If group discussion produces greater risk 
taking, it should show an effect similar to 
that obtained by making the predictive judg- 
ments impersonal and private. 

While predictive judgments about the oc- 
currence of unethical behavior are not the 
equivalent of actual risk-taking behavior, the 
multidimensional scale designed to measure 
such judgments has been shown to predict
the actual unethical behavior of the judges
(Rettig & Pasamanick, 1964; Rettig &
Sinha, 1965). In these experiments it became
evident that the intervening variable between
predictive judgment and behavior was sensitivity
to social censure. Subjects who took
behavioral risks (e.g., of cheating in order
to increase monetary gains) were found to
make greater discrimination in judgment between
matched items portraying unethical
behavior differing only in the negative reinforcement
value of censure (RVcens) than
subjects who did not take such risks. Matching
the same stimulus items in accordance
with other built-in determinants of unethical
behavior, such as the expectancy (Egn) and
the reinforcement value of gain (RVgn) or
the expectancy of censure (Ecens), produced
no results. The above relationship between 
unethical (risk-taking) behavior and im- 
personal predictive judgments was obtained 
despite the interval of 1 year between be- 
havior and judgment. The results were ex- 
plained in terms of differential learning, in 
that subjects motivated to take unethical 
risks apparently have learned to be generally 
more sensitive about censure since it had 
served as a (negative) reinforcer during pre- 
vious behavior. Subjects not motivated to 
take such risks are less sensitive to censure 
since they have not undergone a similar 
learning process. 

In these and other studies it was also
consistently shown that the RVcens component
of the stimulus items explained more variance
in predictive judgments than the remaining
built-in determinants (Rettig, 1964; Rettig
& Pasamanick, 1964; Rettig & Rawson,
1963). However, all of these studies were
conducted under individual judgment conditions.
As will be shown later, group judgment
conditions apparently create a difference
in orientation, emphasizing the reinforcement
value of gain (RVgn) more than the reinforcement
value of censure (RVcens).


METHOD 


Behavior Prediction Scale 


The basic structure of the multidimensional scale
was described in a previous report (Rettig & Rawson,
1963). Briefly, the scale consists of a series of
stimulus items, each portraying a person (either
hypothetical or the subject himself) in conflict about
stealing money from a bank. Each item presents
four determinants (the expectancy and reinforcement
value of gain, and the expectancy and reinforcement
of censure) in the same sequence, randomly
varying the levels of any one determinant
(high or low) from item to item. This combination
of determinants and levels required 16 (2⁴) stimulus
items. Subjects were requested to predict on a scale
ranging from 0 (definitely not) to 6 (definitely
yes) whether or not the money would be taken. The
high and low levels of the determinants were presented
as follows:


1. Reinforcement value of gain (RVgn). High:
the money is needed for a crucial medical operation.
Low: the money is needed by other people.

2. Expectancy of gain (Egn). High: the medical
operation was guaranteed to cure the illness, the
money obtainable would help many people. Low:
the success of the operation was not guaranteed,
the money obtainable would help only very few
people.

3. Negative reinforcement value of censure
(RVcens). High: the theft would result in expulsion
from the bank and charge of criminal conduct.
Low: the theft would be settled in private
with the bank president.

4. Expectancy of censure (Ecens). High: the
theft would be detected. Low: the theft would go
unnoticed.
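The 16 stimulus items follow from crossing four determinants at two levels each. A minimal sketch of that enumeration is given here; the dictionary layout and function name are illustrative assumptions, and the determinant labels are written inline as RVgn, Egn, RVcens, and Ecens.

```python
from itertools import product

DETERMINANTS = ("RVgn", "Egn", "RVcens", "Ecens")
LEVELS = ("high", "low")


def build_items():
    """Enumerate every high/low combination of the four determinants."""
    return [
        dict(zip(DETERMINANTS, combo))
        for combo in product(LEVELS, repeat=len(DETERMINANTS))
    ]


items = build_items()
print(len(items))  # 16

# The matched sample item quoted in the text: crucial operation (RVgn high),
# no guarantee of cure (Egn low), private settlement (RVcens low),
# theft certain to be detected (Ecens high).
sample_item = {"RVgn": "high", "Egn": "low", "RVcens": "low", "Ecens": "high"}
print(sample_item in items)  # True
```

Because every combination occurs exactly once, items can be matched in pairs that differ on a single determinant, which is the kind of comparison the scale relies on.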


Two forms of the scale were constructed, a personal 
form on which the subjects predict their own be- 
havior, and an impersonal form on which the sub- 
jects predict the behavior of a hypothetical person. 
The following are two matched sample items, one 
from the personal and the other from the impersonal 
form of the scale. 


[Personal] Assuming you are a bank employee
in urgent need of a large sum of money for a
crucial medical operation you need. You are
thinking of stealing the money from the bank.
The operating surgeon could not give you any
guarantee that the operation would cure your
illness. You are certain that your theft would be
detected sooner or later. However, you are convinced
that if you are caught you would settle
the matter privately with the bank president.

[Impersonal] A bank employee was in urgent 
need of a large sum of money for a crucial 
medical operation he needed. The employee is 
thinking of stealing the money from the bank. The 
operating surgeon could not give the employee any 
guarantee that the operation would cure his illness. 
The employee was certain that his theft would be 
detected sooner or later. However, the employee 
was convinced that if he was caught he would 
settle the matter privately with the bank president. 


The instructions accompanying the scale emphasized 
that subjects were to predict only whether or not 
the money would be taken, not to judge how wrong 
it would be to take it. Separate Kuder-Richardson
reliability estimates were found to be .94 for the 
individual condition of judgment, and .90 for the 
group condition. 


Subjects 


The subjects were 160 undergraduate students 
equally distributed by sex and within the various 
judgment conditions. All subjects were seen in their 
original classrooms, with entire class sections par- 
ticipating in the study. Fourteen different sections 
were tested. The equal distribution of subjects 
within the various cells was accomplished after the 
ied with the aid of a table of random numbers.


Individual versus Group Condition 


In the individual condition of judgment, subjects 
were administered the scale under ordinary class- 
room procedure. Both forms of the questionnaire 
were administered at the same time in such a fashion 
that adjacent students received different forms. In 
the group condition subjects were asked to cluster 
themselves into three- or four-member groups, facing
each other. Subjects were then instructed to
read each question and to discuss it aloud with the
other members in their group. Following discussion
each subject made his own judgment; group consensus
was not required. Adjacent groups received
different forms of the scale.


Public versus Private Condition 


In the public judgment condition subjects were 
informed in the beginning that upon completion of 
the questionnaire they would exchange their copy 
with that of another student. The other student 
would be one of the group members in the group 
condition of judgment, or another classmate in the 
individual condition of judgment. In the private
judgment condition no instructions for the exchange 
of questionnaires were given. 
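
The cell structure implied by this design can be checked with a short calculation. The sketch below assumes that the four two-level factors mentioned in the text (individual versus group judgment, private versus public judgment, personal versus impersonal form of the questionnaire, and sex) were fully crossed. The factor labels are illustrative and are not taken from the original materials. Under that assumption, 160 subjects give 10 subjects per cell and 144 within-cell degrees of freedom, which agrees with the error term reported for the F ratios.

```python
from itertools import product

# Two-level factors named in the text; treating them as fully crossed is an assumption.
factors = {
    "judgment": ("individual", "group"),
    "publicity": ("private", "public"),
    "form": ("personal", "impersonal"),
    "sex": ("male", "female"),
}

n_subjects = 160
cells = list(product(*factors.values()))   # every combination of factor levels
per_cell = n_subjects // len(cells)        # equal distribution across cells
error_df = n_subjects - len(cells)         # within-cell degrees of freedom

print(len(cells), per_cell, error_df)
```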


RESULTS 


As seen in Table 1 the group condition 
and the impersonal condition of judgment 
produce the expected effect of increased pre- 
diction of the occurrence of the unethical 
behavior. While the effect produced by group 
discussion alone reaches an acceptable level 
of significance (F = 5.06, df = 1/144, p <
.05), and the effect due to the impersonal
condition of judgment is similar (F = 50.40,
df = 1/144, p < .001), both sources of risk
combine to produce a significant interaction
(F = 6.82, df = 1/144, p < .01). That is,
the group effect is significantly more pro- 
nounced when the judgments are personal 
than when they are impersonal. The effect


TABLE 1


SCALE MEANS AND STANDARD DEVIATIONS
OF PREDICTIVE JUDGMENTS FOR LOW- AND
HIGH-RISK CONDITIONS


Condition        M       SD

Low risk
  Group        35.36   16.40
  Impersonal   41.05   15.39
  Private      32.10   15.96

High risk
  Individual   30.09   18.10
  Personal     24.40   15.28
  Public       33.35   18.85

produced by publicity does not reach sta- 
tistical significance. However, its interaction 
with personal conditions of judgment does 
reach statistical significance (F = 5.40, df =
1/144, p < .05), showing the expected effect
of publicity only for personal judgments.
For impersonal judgments the effect of pub-
licity is reversed, increasing rather than de-
creasing the prediction. There were no sig-
nificant differences in relation to sex. These
findings support the hypothesis that group 
discussion produces a shift toward increased 
risk taking in judgments. 

An inspection of the internal determinants 
of the predictive judgments, especially in 
their relationship to the two more critical 
sources of variation in risk (group discussion 
and personal judgments), provides further 
clues about the decision-making process 
underlying the judgments (Table 2). While 
personal judgment conditions significantly 
affect all internal determinants (p < .001) 
except for RVgain, the group condition of judg-
ment affects only RVgain (F = 16.82, df =
1/144, p < .001). These findings indicate
that while the overall effects are similar for 
both conditions of risk, the judgmental proc- 
ess itself apparently differs. 


TABLE 2


MEAN DIFFERENCE SCORE OF PREDICTIVE
JUDGMENTS BY RISK CONDITION
AND ITEM COMPONENT


Condition     Egain   RVgain   Ecens   RVcens

Group          6.79    12.94    6.74    9.6
Individual     6.86     7.36    6.69    9.99

Impersonal     8.35    10.60    8.93   11.
Personal       5.30     9.70    4.50    7.95


SALOMON RETTIG 


Discussion 


Group discussion and impersonal conditions 
of judgment produce increased risk taking, 
as evidenced by a greater anticipation of 
the occurrence of unethical behavior. Privacy 
in judgment shows a similar effect only if 
the judgments are personal ones, referring 
to the subject’s own behavior. These results 
were predicted from the general notion that 
low conditions of risk make for increased 
risk taking. That is, judgments of unethical
behavior made in private rather than in 
public, and referring to a hypothetical per- 
son rather than to the subject himself are 
less likely to be censured and thus represent 
low-risk conditions of judgment. More im- 
portant, previous studies have shown that 
group discussion also lowers the conditions 
of risk. The latter is supported by the pres- 
ent study, since the effect produced by group 
discussion was found to be similar to the 
effects produced by varying the other condi- 
tions of risk. 

However, certain important differences must
be noted in comparing the present study 
to previous studies of group effects on risk 
taking. In the previous studies the higher 
risk choices were also choices carrying higher 
reinforcement values of gain, monetary and 
social ones. Since choice of the higher re- 
inforcement values of gain was not experi- 
mentally separated from the selection of the 
higher levels of risk, it was not possible to 
observe the effect of one upon the other. 
In the present study the higher risk judg- 
ments are socially undesirable (unethical) 
ones, and carry no higher monetary incen- 
tives for the subjects. Furthermore, the re- 
inforcement value of gain is only one of four 
different considerations affecting the judg- 
ments. This design permits not only the 
isolation of the reinforcement value of gain 
and the measurement of its effect on the 
selection of higher levels of risk, but also 
the evaluation of the effect of the remaining 
internal determinants on risk preference in 
judgment, especially in their interaction with 
the external sources of variation in risk. It 
is here that the present study clearly points 
to the reinforcement values of gain as being 
largely responsible for the observed differences
of risk taking in judgments between
individual and group conditions. 

Precisely why group discussion should pro- 
duce an increased emphasis on the reinforce- 
ment value of gain is not entirely clear. 
Perhaps group discussion brings about a re- 
duction in anticipated censure (or anxiety) 
for the participating group members since 
the discussion provides each member with 
the opportunity of testing the approval or 
disapproval of the other group members and 
adjusting his responses accordingly. In other
words, the process of communication set in 
motion by the group discussion may result 
in a lowered expectancy of being censured 
and a concomitantly greater feeling of se- 
curity. The resulting reduction in anticipated 
censure would, in turn, permit preference of
a more “offensive” strategy (maximization
of gain) over a “defensive” one (avoidance
of censure). Here one might also expect a
reduced emphasis on RVcens in the group
condition of judgment. While RVcens did show
a greater drop than the other components, the
shift does not attain statistical significance,
probably because of the greater “end effect” in
the individual judgment condition. Such an
effect tends to artificially restrict variability
in judgments. (Probably for the same reason
RVcens is also restricted in personal judg-
ment conditions.)

While the above conjecture is highly specu- 
lative at this time, it might offer an ex- 
planation as to why group discussion rather 
than group consensus is the critical ante- 
cedent of increased risk taking observed in 
groups. Consensus alone, in skipping the 
process of censure testing and subsequent 
adjustment of one's own position, may tend 
to increase rather than decrease the anxiety 
of the participating subjects. 

Since riskiness in judgment is not the
equivalent of risk taking in other forms of
behavior, even if these judgments are pre- 
dictions by a subject about himself, the 
present findings must be considered explora- 
tory. Here it is also important to realize that 
the levels which are selected to represent 
the various components of the stimulus items 
are chosen arbitrarily. Although the present 
findings in the individual condition of judgment
are in agreement with the results of
previous studies in which somewhat different 
scales were used, the consistency in the re- 
sults may partially reflect a common experi- 
menter bias in the selection of the component 
levels. However, despite the possibility for 
such common experimenter bias, the group 
condition of judgment produced results which 
were clearly different from those consistently 
shown in individual judgment conditions. In 
individual judgment conditions the negative 
reinforcement value of censure was found to 
be the most important determinant of pre- 
dictive judgments of unethical behavior as 
well as of actual unethical behavior. In the 
group judgment condition the reinforcement 
value of gain is found to be the most im- 
portant determinant of predictive judgment. 
Whether under group conditions the rein- 
forcement value of gain will also prove to 
be the most important determinant of actual 
unethical behavior still remains to be tested.


2 experiments were conducted. In the 1st experiment an attempt was made to
establish conditioning to visual stimuli presented below the awareness threshold. 
No evidence of conditioning was obtained. However, during a subsequent 
supraliminal test period, analysis of GSR records indicated that generalization 
to semantically related material had occurred. In the 2nd experiment condition- 
ing procedures were applied to stimuli presented just below the recognition 
threshold. In this case conditioning was successfully established. Semantic gen- 
eralization was also observed during the subsequent test period. In neither 
experiment was there any evidence of structural generalization. Control ex- 
periments indicated that in the absence of experimental manipulation there was 
no evidence that Ss were differentially sensitive to the various words employed
in the test sessions.


Baker (1938) reported a study where an 
attempt was made to establish conditioned 
pupillary responses to a tone presented below 
the awareness threshold. This author claimed 
that not only could conditioned responses be 
obtained under such conditions, but also that 
these responses were established more quickly 
and were more resistant to extinction than 
were conditioned responses established to 
supraliminal tones. Similar results were 
reported by Metzner and Baker (1939). 

These results appeared exceedingly inter- 
esting although somewhat surprising. How- 
ever, Wedell, Taylor, and Skolnick (1940) 
and Hilgard, Miller, and Ohlson (1941) inde- 
pendently attempted to replicate them with- 
out success. Eriksen (1960) has previously 
made the point that other workers, specifi- 
cally Steckle and Renshaw (1934) and Hil- 
gard, Dutton, and Helmick (1949), experi- 
enced considerable difficulty in establishing 
conditioned pupillary responses to supra- 
threshold tones. 


1 The study reported in this paper was conducted
at University College London as part of a research 
program undertaken for the PhD degree in the 
University of London under the general supervision 
of N. F. Dixon. The author is indebted to the 
University of Queensland for financial support while 
this program was carried out. 

2 Now in the Department of Psychology, Monash
University, Clayton, Victoria, Australia. 


Interest in the problem of establishing 
conditioned responses to stimuli of which the 
subject was unaware waned somewhat during 
the 1940s and early 1950s as those workers 
interested in discrimination without aware- 
ness concerned themselves mainly with prob- 
lems of perceptual defense and subception. 
It was not until experiments by Wilcott 
(1953) and Taylor (1953) that interest in 
subthreshold conditioning was rekindled. 
Wilcott, using GSR as his response index, 
attempted to demonstrate conditioning uti- 
lizing shock as the unconditioned stimulus 
(UCS) and tones similar to those used by 
Baker as his conditioned stimuli. No evidence 
of GSR conditioning was demonstrated. 
Taylor's study, however, did produce significant
results. Taylor utilized six geometrical
figures as stimuli. During the conditioning 
session, these stimuli were presented just be- 
low the recognition threshold, and certain of 
them received reinforcement (shock). Taylor 
claimed that he had been able to demon- 
strate that subthreshold discrimination had 
occurred. Eriksen (1960) argued that Tay- 
lor's claim was somewhat premature and that 
although subjects may have been unable to 
identify the various stimuli presented they 
could have been responding to suprathreshold 
cues which could have formed the basis for 
the discrimination. This explanation was very
similar to the partial recognition explanation




GENERALIZATION AND SUBLIMINAL STIMULI 


advanced by Bricker and Chapanis (1953) 
in their criticism of the now classical “subception”
study by Lazarus and McCleary
(1951). 

It is this author's contention that any 
experiment which utilizes standard condition- 
ing procedures and relies upon differential 
responsivity to reinforced and nonreinforced
stimuli would be subject to the powerful ob- 
jections raised by Eriksen and by Bricker 
and Chapanis. In the experiments to be de- 
scribed in this paper the reinforced stimuli 
were not employed in the test period subse- 
quent to conditioning. Instead, stimuli seman- 
tically related to the reinforced stimuli were 
used as the critical stimuli. Stimuli structur- 
ally similar to the reinforced stimuli were 
also included in the test-period stimulus 
array. These stimuli were included to deter- 
mine the role of partial recognition in 
response determination. 


Research Problem 


It is the aim of these experiments to determine
if conditioning procedures applied to
subliminally presented stimuli have any sys- 
tematic effects on the magnitudes of changes 
in GSR following supraliminal presentations 
of various stimuli subsequent to completion 
of the conditioning period. It is proposed to 
examine the relative power of semantic and 
structural factors in response determination. 


METHOD 


Subjects 


A total of 16 subjects, 8 male and 8 female, were 
used in each of the major experiments described 
here. All were students of University College Lon- 
don. Psychology students were excluded. Each sub- 
ject was paid a fee of two shillings and sixpence 
(approximately $.40). An additional 48 subjects were 
used in four control studies. These subjects were
drawn from a similar population. 


Apparatus 


The apparatus used for stimulus presentation was
similar in many respects to that used in previous 
studies by Baker and Feldman (1956) and by 
Zajonc and Nieuwenhuyse (1964). The stimulus 
image was projected onto a 2-inch square frosted 
screen. The intensity of the image was adjustable 
by manipulations of a control knob connected to 
two polaroid discs positioned between the projector 
and the screen. These manipulations could be executed
by either the experimenter or the subject.
Variations in the positions of the polaroid discs were 
recorded on a dial calibrated in degrees. A 5-degree 
change reflected an illumination change of .10 log 
footlambert. A vertical neutral density filter was
also positioned between the projector and the 
screen. This filter was rarely adjusted, but its purpose 
was to allow the experimenter to adjust for extreme 
individual differences such that the range of the 
polaroid filter was appropriate for each subject. An 
adjustment of 1 inch in the position of this wedge 
induced an intensity change of .10 log footlambert. 
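
The two calibration figures above translate directly into a conversion between control settings and intensity. The following is a minimal sketch, assuming that both the polaroid dial and the neutral density wedge behave linearly over their working range (the text gives only one calibration point for each). The function names are illustrative.

```python
# Calibration figures taken from the apparatus description.
LOG_FL_PER_DIAL_DEGREE = 0.10 / 5.0   # 5 degrees of dial rotation = .10 log footlambert
LOG_FL_PER_WEDGE_INCH = 0.10 / 1.0    # 1 inch of wedge travel = .10 log footlambert


def intensity_change(dial_degrees=0.0, wedge_inches=0.0):
    """Combined intensity change in log footlambert (linearity is an assumption)."""
    return dial_degrees * LOG_FL_PER_DIAL_DEGREE + wedge_inches * LOG_FL_PER_WEDGE_INCH


def dial_degrees_for(log_fl_change):
    """Dial rotation needed to produce a given change in log footlambert."""
    return log_fl_change / LOG_FL_PER_DIAL_DEGREE


# Rotation needed for a .2 log footlambert change, and the change
# produced by 10 degrees of rotation plus 1 inch of wedge travel.
print(round(dial_degrees_for(0.2), 6))
print(round(intensity_change(10.0, 1.0), 6))
```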

A 12-volt power supply was employed to deliver 
the UCS (shock) to the subject. An inductorium 
was used. The shocker was connected to subjects
by means of two dry-silver electrodes set into an 
armband and placed around the subject’s left 
forearm. 

The apparatus used to measure GSR incorporated 
a circuit similar to that described by Nichols and 
Daroge (1955). This apparatus recorded percentage 
of change in resistance directly on a large dial with 
a range of 25%. It was intended to feed this data
directly onto a pen recorder, but the pen recorders 
were found to be subject to distortion due to inter- 
ference from electrical apparatus located nearby. It 
was therefore necessary to resort to experimenter 
recording. To increase the accuracy of experimenter 
recording of small changes in resistance the GSR 
information was fed into a multiplier such that 
a change of 1% was represented by 12.5 on the 
multiplier scale. The GSR apparatus was connected 
to the subject by means of two dry-silver electrodes 
embedded in a band strapped around the subject’s 
right hand. The whole hand was encased in a padded 
clamp anchored to a table. Thus the right hand 
was kept immobile while GSRs were recorded. 
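
The multiplier scale described above amounts to a fixed scaling of the percentage change in resistance. A minimal sketch of the conversion follows. The readings are hypothetical values chosen for illustration and are not data from the study.

```python
MULTIPLIER_UNITS_PER_PERCENT = 12.5  # a 1% change in resistance reads as 12.5


def multiplier_to_percent(reading):
    """Convert a multiplier-scale reading back to percentage change in resistance."""
    return reading / MULTIPLIER_UNITS_PER_PERCENT


def percent_to_multiplier(percent_change):
    """Convert a percentage change in resistance to the multiplier scale."""
    return percent_change * MULTIPLIER_UNITS_PER_PERCENT


# Hypothetical readings noted by the experimenter during one session.
readings = [0.0, 7.5, 19.25, 22.5]
percent_changes = [multiplier_to_percent(r) for r in readings]
print([round(p, 2) for p in percent_changes])
```

One unit on the multiplier scale corresponds to a change of .08%, which is the gain in reading resolution the multiplier was meant to provide.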

The subject and experimenter were separated by a 
hardboard partition. 


Procedure 


Experiment I. The experiment was divided into 
three stages. These were: Stage 1, threshold testing
period; Stage 2, a conditioning period; Stage 3, a
test period.

Stage 1: A total of 12 trials was given. The 
stimuli used were the words ABOUT, HEARD, and MAN. 
These words are shown in the Thorndike and Lorge 
(1944) list as having frequencies equal to or greater 
than the frequencies of the stimuli used during the 
conditioning period. Each stimulus was presented 
four times. On two trials the level of illumination
at the beginning of the trial was quite high and the 
subject could recognize the stimuli. The ambient 
illumination of the screen throughout this experi- 
ment and indeed throughout all the studies reported 
here was 25 footcandles. The subject was required 
to reduce the brightness by manipulations of the 
control knob until he could not see anything on the 
screen. On the other two trials the level of illumina- 
tion at the start of the trial was well below the 
subject’s awareness threshold, and he was required 
to increase the level of illumination until he could
just report awareness of the presence of a stimulus
on the screen. The order of presentation of the 
stimuli and the alternative starting positions were 
randomly determined. The criterion of awareness 
threshold was the lowest illumination at which the 
subject reported awareness during the 12 trials. It 
should be noted here that the method of threshold 
determination resembled that used by Dixon (1958) 
and was not the forced-choice procedure advocated 
by Eriksen (1960) and used by Fuhrer and Eriksen 
(1960). In the opinion of this writer the forced- 
choice method of threshold determination is inap- 
propriate for this type of situation. The forced- 
choice procedure would require the subject to guess 
as to whether or not a stimulus was present on any 
trial. Such a procedure, as Eriksen points out, does 
produce lower thresholds, but at the same time by 
its very nature almost completely eliminates the pos- 
sibility of the subsequent demonstration of sub- 
liminal perception. Better than chance guessing 
may conceivably indicate that the subject's response 
behavior is being modified by the presence of 
"subliminal" stimuli. 

Stage 2: The subject was connected to the shocker 
and to the GSR. His pain threshold was determined. 
This level of shock was used throughout the condi- 
tioning period. He was told that during this period 
he would probably not be able to see anything on 
the screen but that stimuli would be presented. He 
was also told that certain of the stimuli would sometimes
be followed by electric shock. He was requested
to report immediately if he became aware
that a stimulus was being presented. 

Six stimuli were presented in randomized groups. 
These were the words CUP, KNIFE, SPOON, SUIT, 
LEARN, and FAULT. The first three (all kitchen 
utensils) were reinforced on two out of every three 
occasions they were presented. The trials on which
each of these words was reinforced were randomly 
determined. Just prior to each stimulus presentation 
the experimenter said the word “Now.” Partial 
reinforcement was used because of its greater resist- 
ance to extinction (Jenkins & Stanley, 1950). The 
stimuli were presented for 2 seconds at a level of 
illumination .1 log footlambert below the awareness 
threshold determined in Stage 1. The interstimulus 
interval was approximately 20 seconds. Since an 
inductorium was used, duration of shock was mo- 
mentary. Shock was applied .5 second after stimulus 
onset. 
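
The link between the Stage 1 criterion and the Stage 2 presentation level can be written out as a short calculation. The sketch below takes the awareness threshold to be the lowest illumination at which awareness was reported and places the conditioning stimuli .1 log footlambert below it, as described above. The trial readings are hypothetical and the function names are illustrative.

```python
def awareness_threshold(reported_levels):
    """Lowest illumination (log footlambert) at which awareness was reported."""
    if not reported_levels:
        raise ValueError("at least one threshold trial is required")
    return min(reported_levels)


def presentation_level(threshold, offset=0.1):
    """Illumination used during conditioning: a fixed offset below threshold."""
    return threshold - offset


# Hypothetical readings from the 12 threshold trials (log footlambert);
# these are not data from the study.
stage1_trials = [0.62, 0.55, 0.58, 0.60, 0.53, 0.57,
                 0.61, 0.56, 0.54, 0.59, 0.63, 0.55]

threshold = awareness_threshold(stage1_trials)
level = presentation_level(threshold)
print(threshold, round(level, 2))
```

Taking the minimum over all trials is a conservative criterion: any single report of awareness at a low level pulls the presentation level down with it.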

Conditioning was continued until 20 reinforce- 
ments had been applied to each reinforced stimulus; 
that is, there were 60 reinforcements in all. 
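
One way to generate a schedule with these properties is sketched below. It assumes that each randomized group contained all six words once and that, for every reinforced word, two of each successive three presentations were shocked. The text does not say how often the neutral words were presented or how the randomization was carried out, so those details are assumptions made for illustration.

```python
import random

REINFORCED = ("CUP", "KNIFE", "SPOON")   # kitchen utensils, paired with shock
NEUTRAL = ("SUIT", "LEARN", "FAULT")     # never paired with shock


def build_schedule(reinforcements_per_word=20, seed=0):
    """Return a list of (word, shocked) trials for the conditioning period."""
    if reinforcements_per_word % 2:
        raise ValueError("reinforcements_per_word must be even for a 2-in-3 schedule")
    rng = random.Random(seed)
    n_blocks = reinforcements_per_word // 2  # each block of 3 presentations holds 2 shocks

    # For each reinforced word, decide at random which presentation in every
    # block of three goes unshocked.
    shock_plan = {}
    for word in REINFORCED:
        plan = []
        for _ in range(n_blocks):
            block = [True, True, False]
            rng.shuffle(block)
            plan.extend(block)
        shock_plan[word] = plan

    trials = []
    for i in range(3 * n_blocks):
        group = [(word, shock_plan[word][i]) for word in REINFORCED]
        group += [(word, False) for word in NEUTRAL]
        rng.shuffle(group)  # one randomized group of the six stimuli
        trials.extend(group)
    return trials


schedule = build_schedule()
total_shocks = sum(shocked for _, shocked in schedule)
print(len(schedule), total_shocks)
```

With 20 reinforcements per word on a two-in-three schedule, each reinforced word is presented 30 times, and the three reinforced words together account for the 60 reinforcements stated above.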

Two subjects reported awareness of light on the 
screen during the conditioning period. The experi- 
ment was terminated in each case. The 16 subjects 
on whom the results of this experiment are based 
failed to report awareness. 

Stage 3: During this period the words FORK, 
PLATE, SPOOL, KNAVE, GRASS, TREE, ADVICE, and ISSUE 
were used as stimuli. Two of these were semantically 
related to the reinforced stimuli, two structurally related
to the reinforced stimuli, two semantically related
to each other but not to the reinforced stimuli,
and two unrelated either to each other or to the rein- 
forced material. The level of illumination used en- 
abled subjects to recognize the stimuli without dif- 
ficulty. The subject was requested not to identify 
verbally the words since such verbal responses could 
have contaminated the GSRs. He was given the 
impression that reinforcement would continue. Each 
stimulus was presented twice. The interstimulus 
interval was again approximately 20 seconds. 
Counterbalanced random orders were used to control 
for order and extinction effects. To eliminate any 
possible experimenter bias in the recording of results 
each stimulus was labeled by an independent person, 
and until the completion of the study the experi- 
menter did not know which stimulus was being 
presented on any trial. 

At the completion of the test period subjects were 
shown a typewritten list containing the eight stimu- 
lus words and asked to indicate two which they 
thought might have been associated with shock 
during the conditioning period. 

Experiment II. This experiment was basically very 
similar to the one described above. In this case the 
conditioning was applied to stimuli presented at a 
level of stimulation .2 log footlambert below recogni- 
tion threshold instead of .1 log footlambert below 
awareness threshold. In addition the test stimuli in- 
cluded a word semantically related to one of the 
nonreinforced words used in the conditioning situa- 
tion. 

Stage 1: The procedure used here was identical to 
that used in Experiment I except that it was the 
recognition threshold rather than the awareness 
threshold which was under consideration. Stimuli 
used were the words ABOUT, HEARD, and ADVICE. 

Stage 2: Again the procedure used was similar to 
that of Experiment I except for the illumination 
level and for stimulus presentation. Stimuli used 
were the words SPOON, GRASS, LEARN, and FAULT. The 
stimulus SPOON was reinforced on two-thirds of the
occasions it was presented. There were 40 reinforce- 
ments. Subjects were requested to inform the ex- 
perimenter immediately if they recognized a word. 
Five subjects did in fact recognize a reinforced 
stimulus on a reinforced trial and were discarded. 
All 16 subjects on whom the results are based 
failed to recognize any stimuli during the condition- 
ing period. 

Stage 3: Stimuli used were the words PLATE, TREE, 
SPOOL, and ISSUE. One of these was semantically related
to the reinforced stimulus, one semantically related
to a nonreinforced stimulus, one structurally
similar to the reinforced stimulus, and one not re- 
lated to any of the stimuli used in the conditioning 
period. Stimuli were presented at a level of illumina- 
tion at which subjects could identify easily the 
stimulus words. The change in the stimulus array 
between Experiment I and Experiment II was made 
to determine if responses to a stimulus semantically 
related to a nonreinforced word present in the conditioning
period stimulus array were significantly
different from those to a word semantically related
to a reinforced stimulus. 

At the completion of the test period subjects were 
asked to choose which of the four test stimuli was 
the most likely to have been associated with shock 
during the conditioning period. 

Control studies. Four control experiments were 
performed. These experiments were necessitated by
the possibility that the words used in the two test
periods may have had differential capacities to elicit
GSRs independently of the selective reinforcement
employed during the conditioning periods.

Two experiments, each using 16 subjects, were run 
under conditions identical to those used in the test 
periods of the two major experiments. These sub- 
jects did not undergo a threshold-testing period or 
a conditioning period. 

Two further experiments, each using eight subjects,
were conducted. In these studies subjects replicated
the entire experimental procedures of each
of the major studies except that reinforcements, 
although given, were not associated with specific 
stimuli. It was considered that these latter two control
experiments were necessary to check on the very 
slight possibility that the reinforcements may have 
had different sensitizing effects on the various stim- 
uli used in the test periods. 


RESULTS


Experiment I


There was no evidence whatsoever to sug- 
gest that conditioning had been established 
during Stage 2. For each subject, mean percentages
of change were calculated for the
first and last six unreinforced presentations 
of the reinforced and neutral stimuli. These 
mean percentages of change were based on 
changes in GSR occurring between 1 and 4 
seconds after stimulus onset. Preliminary 
work with this apparatus indicated that GSRs 
occurring during this time period were likely 
to be conditioned responses. A sign test 
showed that there was no evidence to suggest 
that responses to the reinforced stimuli were 
larger at the end of conditioning than they 
were at the beginning. Neither was there
any difference between responses to reinforced
and neutral stimuli at the end of the condi- 
tioning period. 

The mean percentages of change in GSRs 
for the various words and for the stimulus 
categories during the supraliminal test pe- 
riod are given in Table 1. Once again these 
means are based on changes in GSR occur- 
ring between 1 and 4 seconds after stimulus 
onset. The words were classified in the fol- 
lowing manner: Category 1, stimuli semanti- 
cally related to the reinforced stimuli; Cate- 
gory 2, stimuli structurally related to the 
reinforced stimuli; Category 3, stimuli seman- 
tically related to each other but not to the 
reinforced stimuli; Category 4, stimuli un- 
related to each other or to the reinforced 
stimuli. 
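
Since each category contains two words presented equally often, the category means follow from the word means reported in Table 1. The sketch below recomputes them from the printed word means. It uses decimal arithmetic with half-up rounding, which is an assumption about how the published figures were rounded.

```python
from decimal import Decimal, ROUND_HALF_UP

# Word means as printed in Table 1 (percentage change in GSR).
word_means = {
    "FORK": "1.54", "PLATE": "1.81",
    "SPOOL": "0.60", "KNAVE": "0.72",
    "GRASS": "0.58", "TREE": "0.90",
    "ADVICE": "0.86", "ISSUE": "0.86",
}

categories = {
    "Category 1": ("FORK", "PLATE"),    # semantically related to reinforced stimuli
    "Category 2": ("SPOOL", "KNAVE"),   # structurally related to reinforced stimuli
    "Category 3": ("GRASS", "TREE"),    # related to each other only
    "Category 4": ("ADVICE", "ISSUE"),  # unrelated
}


def category_mean(words):
    """Mean of the word means, rounded half-up to two decimals."""
    total = sum(Decimal(word_means[w]) for w in words)
    return (total / len(words)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


category_means = {name: category_mean(words) for name, words in categories.items()}
for name, mean in category_means.items():
    print(name, mean)
```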

To ensure that the data satisfied the con- 
ditions for the use of the analysis of variance 
technique, particularly normality and homo- 
geneity of variance, it was considered de- 
sirable to transform the data prior to analysis. 
A logarithmic transformation similar to that 
suggested by Fisher (1954) was used. Since 
the data included a number of zero entries,
the transformation log (1 + X) was used.
Table 2 details the result of the analysis of 
variance. 
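
The reason for adding 1 before taking logarithms is that the logarithm of zero is undefined, whereas log (1 + X) maps a zero entry to zero. A minimal sketch follows. The base of the logarithm is not stated in the text, so base 10 is an assumption, and the scores are hypothetical.

```python
import math


def transform(x):
    """log(1 + X) transformation; base 10 is assumed here."""
    if x < 0:
        raise ValueError("percentage change is expected to be non-negative")
    return math.log10(1.0 + x)


# Hypothetical percentage changes in GSR, including zero entries.
scores = [0.0, 0.4, 1.2, 0.0, 2.5]
transformed = [transform(x) for x in scores]
print([round(t, 4) for t in transformed])
```

The choice of base affects only the scale of the transformed scores, not the F ratios computed from them.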

The particular analysis of variance used
here involved a simple two-way Stimuli ×
Subjects design with subsequent subclassi-
fication of stimuli into the various categories.
This design seems superior to the alternative
Stimulus Categories × Subjects design since
there is no prior guarantee that the stimuli
are homogeneous within categories. Had the
stimuli within-groups variance estimate pro-
duced an F with p < .05 it would have been
used as the error term for the comparison
between word groups. It is self-evident that
such a procedure would provide greater pro-
tection against a Type I error.


TABLE 1

MEAN PERCENTAGE OF CHANGES IN GSR FOR THE FOUR STIMULUS CATEGORIES

         Category 1      Category 2      Category 3      Category 4
Means    FORK   PLATE    SPOOL  KNAVE    GRASS  TREE     ADVICE ISSUE

Word     1.54   1.81     0.60   0.72     0.58   0.90     0.86   0.86
Group        1.68            .66             .74             .86


TABLE 2

EXPERIMENT I: ANALYSIS OF VARIANCE
CHANGES IN GSR—TEST PERIOD

Source                    SS       df    MS       F

Words (between groups)    1.3319     3   .4440   21.24*
Words (within groups)     0.12       4   .0321    1.54
Subjects                  2.0849    15
Words × Subjects          2.6934   105   .0257    1.23
Error                     2.6773   128   .0209
Total                     8.9158   255

*p < .001.
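
The F ratios in Table 2 can be recovered from the printed sums of squares and degrees of freedom, each mean square being divided by the error mean square. The sketch below does this for the rows whose sums of squares are fully legible. Because it works from unrounded mean squares, the between-groups F comes out near 21.2 rather than exactly the printed 21.24.

```python
def mean_square(ss, df):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return ss / df


# Sums of squares and degrees of freedom as printed in Table 2.
table2 = {
    "Words (between groups)": (1.3319, 3),
    "Words x Subjects": (2.6934, 105),
    "Error": (2.6773, 128),
}

ms = {source: mean_square(ss, df) for source, (ss, df) in table2.items()}
f_between = ms["Words (between groups)"] / ms["Error"]
f_interaction = ms["Words x Subjects"] / ms["Error"]

print(round(ms["Error"], 4), round(f_between, 2), round(f_interaction, 2))
```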


To determine if subjects could consciously 
discriminate between the test words with 
regard to shock associations an analysis of 
the verbal choices made at the conclusion 
of the experiment was performed. The fre- 
quencies of these choices for each stimulus 
category are given in Table 3.
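
The chi-square value accompanying Table 3 can be reproduced from the observed and expected frequencies. The sketch below assumes a goodness-of-fit test against equal expected frequencies (32 choices spread over four categories). The critical value of 7.815 for 3 degrees of freedom at the .05 level is taken from standard chi-square tables.

```python
observed = [10, 8, 5, 9]   # verbal choices per stimulus category (Table 3)
expected = [8, 8, 8, 8]    # 16 subjects x 2 choices, spread evenly over 4 categories

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

CRITICAL_05_DF3 = 7.815    # chi-square critical value, df = 3, alpha = .05
significant = chi_square > CRITICAL_05_DF3

print(chi_square, df, significant)
```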


Experiment II 


There was evidence of conditioning in Ex- 
periment II. During the last six unreinforced 
trials of the shocked stimuli 12 of the 16 
subjects showed larger GSRs than were obtained
for the first six unreinforced trials
(p = .038, one-tailed). In addition differences 
between mean GSRs to reinforced and neu- 
tral stimuli during the final six trials of the 
conditioning period were significant (t =
4.50, df = 15, p < .001).
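
The probability quoted for 12 of 16 subjects is consistent with a one-tailed sign test, the test named for the corresponding comparison in Experiment I. Assuming that is the test used here, the value can be reproduced from the binomial distribution with a success probability of one half.

```python
from math import comb


def sign_test_one_tailed(successes, n):
    """P(X >= successes) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


p_value = sign_test_one_tailed(12, 16)
print(round(p_value, 3))  # 0.038
```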

Mean GSRs to the various stimuli pre- 
sented during the supraliminal test period 
are given as follows: PLATE, 1.29; SPOOL, .78; 
TREE, 1.00; ISSUE, .73. A summary of the 
analysis of variance performed on the data 
yielding these means is given in Table 4. 

Since responses to the word TREE (seman- 
tically related to a nonreinforced stimulus) 
were larger than responses to all stimuli with 
the exception of PLATE, it was considered de- 
sirable to check on the significance of the 

TABLE 3


EXPERIMENT I: VERBAL CHOICES AS TO WHICH OF TEST
STIMULI WERE ASSOCIATED WITH SHOCK


            Category 1   Category 2   Category 3   Category 4

Observed        10            8            5            9
Expected         8            8            8            8


Note.—χ² = 1.75, df = 3, ns.


A. G. WORTHINGTON 


difference between PLATE and TREE. Other 
comparisons orthogonal to the one of interest 
were also performed. Results of these analyses
are detailed as follows: PLATE versus
TREE—F = 8.52, df = 1/192, p < .01; SPOOL
versus ISSUE—F < 1, df = 1/192, ns; PLATE
and TREE versus SPOOL and ISSUE—F = 30.42,
df = 1/192, p < .001.
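
The three comparisons listed above form a complete orthogonal set for four stimuli, which is why they use up exactly the 3 degrees of freedom for Words. The sketch below writes each comparison as a set of contrast coefficients and checks orthogonality. The coefficient coding is an illustration and is not taken from the original analysis. The contrast values are computed on the raw means quoted earlier, whereas the F tests were carried out on transformed data.

```python
from itertools import combinations

words = ("PLATE", "SPOOL", "TREE", "ISSUE")

# Contrast coefficients, in the order given by `words` (illustrative coding).
contrasts = {
    "PLATE vs TREE": (1, 0, -1, 0),
    "SPOOL vs ISSUE": (0, 1, 0, -1),
    "PLATE and TREE vs SPOOL and ISSUE": (1, -1, 1, -1),
}


def dot(a, b):
    """Sum of products of paired coefficients."""
    return sum(x * y for x, y in zip(a, b))


sums_to_zero = all(sum(c) == 0 for c in contrasts.values())
orthogonal = all(dot(a, b) == 0 for a, b in combinations(contrasts.values(), 2))

# Raw mean GSRs reported for the test period of Experiment II.
raw_means = {"PLATE": 1.29, "SPOOL": 0.78, "TREE": 1.00, "ISSUE": 0.73}
contrast_values = {
    name: sum(c * raw_means[w] for c, w in zip(coef, words))
    for name, coef in contrasts.items()
}

print(sums_to_zero, orthogonal)
for name, value in contrast_values.items():
    print(name, round(value, 2))
```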

Because of the small numbers involved it 
was not possible to analyze the verbal choices 
made by subjects when presented with the 
four test stimuli and asked to choose which 
stimulus they thought might have been re- 
inforced during the conditioning period. There 
was, however, no indication that there was 
any relation between the autonomic behavior 
and the verbal choice behavior. Four subjects 
chose PLATE; six, SPOOL; four, TREE; and 
two, ISSUE. 


TABLE 4


EXPERIMENT II: ANALYSIS OF VARIANCE
CHANGES IN GSRs—TEST PERIOD


Source              SS       df    MS       F

Words               0.6082     3   .2027   12.99*
Subjects            3.0293    15
Words × Subjects    0.6984    45   .0155    0.99
Error               2.9966   192   .0156

Total               7.3325   255

*p < .001.
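
The partition in Table 4 can be checked for internal consistency. The sketch below derives the degrees of freedom from 4 test words and 16 subjects and confirms that the sums of squares add up to the printed total. The number of presentations per word is not stated for Experiment II. The figure of four is inferred from the total degrees of freedom and should be read as an inference, not a reported fact.

```python
n_subjects, n_words = 16, 4
total_df = 255  # as printed in Table 4

df = {"Words": n_words - 1, "Subjects": n_subjects - 1}
df["Words x Subjects"] = df["Words"] * df["Subjects"]
df["Error"] = total_df - sum(df.values())

# Total observations, and the replications per word per subject they imply.
observations = total_df + 1
replications = observations // (n_subjects * n_words)

# Sums of squares as printed in Table 4.
ss = {"Words": 0.6082, "Subjects": 3.0293, "Words x Subjects": 0.6984, "Error": 2.9966}
ss_total = round(sum(ss.values()), 4)

print(df, replications, ss_total)
```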


Analysis of the data from the four con-
trol experiments failed to indicate either
differences between words or significant Sub-
jects × Words interactions. For all four ex-
periments the between-words F was < 1. In
only one case was the Subjects × Words inter-
action > 1 (F = 1.17, df = 105/108, ns).


Discussion 


The results presented here provide fairly 
strong evidence to suggest that some form 
of mediated discrimination occurred during 
the subliminal conditioning periods. The evi- 
dence for semantic generalization was equally 
strong in both experiments even though there 
was no evidence of conditioning under the 
conditions of Experiment I. In neither study 
was there any evidence to suggest that struc- 
tural generalization occurred. Absence of such 
generalization would indicate that in this
study the partial recognition hypothesis does
not receive support.

The results of the first two control experi- 
ments suggest that the observed differences 
in GSRs to the various test stimuli cannot 
be accounted for by underlying differences 
in the capacities of the various words to 
elicit GSRs. The results of the third and 
fourth control experiments suggest that shock, 
without association with specific stimuli, does 
not systematically affect the capacity of the 
various test stimuli to elicit GSRs. 

Perhaps the most interesting finding of 
these studies was the observation that al- 
though conditioning per se did not occur 
in Experiment I semantic generalization did 
occur in the test period subsequent to conditioning.
The reasons for this surprising result
are unclear, but it is likely the explanation
lies in the nature of thresholds themselves.
Since threshold is a statistical concept, we
might expect changes in sensitivity from
trial to trial. While at no time did this sensitivity
improve sufficiently for subjects to
report awareness, there surely is little doubt
that at times the level of presentation was
closer to awareness than at others. It is suggested
that on such trials some information
must have gotten through to form the basis 
for the subsequently observed mediated gen- 
eralization. Scrutiny of GSR records indicated 
that this may have been the case as there 
were, at times, fairly large responses to un- 
reinforced presentations of the critical words. 
Such responses to neutral words occurred less 
frequently. Also it is known, and I have
personally shown this in a separate study
(Worthington, 1962), that shock occurring
during stimulus presentation has the effect
of temporarily lowering thresholds. On reinforced
trials this lowering of thresholds
may have been sufficient to permit information
to be received. On the unreinforced
trials, however, in the absence of shock,
there would be no reason to expect much
information transmission and hence no systematic
changes in GSRs. 

The evidence for conditioning in Experi- 
ment II provides support for Taylor’s (1953) 
contention that conditioning of stimuli pre- 
sented below the recognition threshold may 
occur. However, because of the subsequent 
demonstration of semantic generalization and 
the absence of structural generalization the 
objections raised by Eriksen (1960) to Tay- 
lor’s study would seem less appropriate here. 
Further evidence of the importance of semantic
factors in responsivity to subliminal 
stimuli can be gleaned by reference to the 
results of Experiment II where it may be 
observed that responses to the word TREE 
(semantically related to a conditioning pe- 
riod nonreinforced stimulus) were larger 
than those to the structurally similar and 
neutral stimuli. 

A further point which should be made in 
connection with these studies is that in 
neither was there evidence of any conscious 
discrimination. When subjects were asked to 
nominate which of the test stimuli they felt 
might have been associated with the shocked 
stimuli during conditioning, their responses 
were at about chance level. 
ERRATUM 


On pages 346 and 347 of the article, “Expectations of Social Acceptance and 
Compatibility as Related to Status Discrepancy and Social Motives,” by Robert B. 
Bechtel and Howard M. Rosenfeld (Journal of Personality and Social Psychology, 
1966, 3, 344—349), Figures 1, 2, and 3 should appear as follows: 
Ss classified as coming from entrepreneurial or bureaucratic families were given 
the Rotter Level of Aspiration (LOA) Board and bargained in a 2-person, 
non-zero-sum game. Entrepreneurs played the game significantly more exploitatively,
except for those with more maladjusted LOA patterns, who were as 
cooperative as the bureaucrats. Several implications of these findings are 
considered. 


“The ideal pecuniary man,” wrote Thorstein
Veblen (1899), “is like the ideal delinquent
in his unscrupulous conversion of goods
and persons to his own ends, and in a callous
disregard of the feelings and wishes of others
and of the remoter effects of his actions
... [p. 237]”—a mordant profile of the
entrepreneur etched rather more acidly than
Bentham’s grand portrait of that rationally
utilitarian exemplar of the “hedonistic calculus”
100 or so years earlier. Robber baron
or apostle of “the greatest good to the greatest
number,” the actions of the entrepreneur in
the marketplace have, for centuries, engaged
theologians and philosophers, economists and
psychologists in defense, attack, or in a quest
for explanation. 

The psychological study of economic bargaining
was given great impetus by game
theory, succinctly introduced to psychologists
by Luce and Raiffa’s (1957) Games and
Decisions, an analysis derived from Von Neumann
and Morgenstern’s (1947) earlier
classic. Through the lens of game theory, a
broad spectrum of interpersonal bargaining—including
economic bargaining as one case—could
be viewed as decision making under
conditions of risk or uncertainty, offering the
possibility of describing (rather than prescribing
or assuming) the utilities of the
participants in the enterprise. 


1 This experiment was supported by University of 
Connecticut Research Foundation Grant 400-5-5L 
to the author. 

2 I would like to thank Peter Lawner for his 
able assistance in the conduct of the experiment and 
the scoring and analysis of the data. Lane Conn, 
Karl Hakmiller, Kenneth Ring, and Julian Rotter 
made perceptive comments of great helpfulness in 
preparing this paper. 


In the experimental study of interpersonal 
and economic bargaining, a miniature con- 
flict of interest situation, the two-person, non- 
zero-sum game, or variations of it, has been 
frequently employed. It possesses the essen- 
tial characteristics of a bargaining situation: 
(a) a conflict of interest between the participants;
(b) the provision of more than one
outcome (i.e., a bargain can be struck, or the
players can fail to conclude a bargain); (c)
as implied by b, the possibility of collaborative
or cooperative strategies (bargains). 
In such games, further, decisions are made 
under conditions of uncertainty: on any trial, 
one player does not know what choice the 
other is going to make. Thus, these two- 
person games parallel those numerous real- 
life conflict of interest situations in which the 
participants do not mutually disclose their 
strategy preferences. The study of bargaining 
with such games has also entailed decision 
making under uncertainty in pursuit of the 
utility of the marketplace—money. 

A number of variables have been shown 
to be related to choice of strategy in games 
of this type: trust (Deutsch, 1960); threat 
(Deutsch & Krauss, 1960); power (Flynn, 
1961; Solomon, 1960); the need variables 
of aggression, autonomy, abasement, and 
deference (Marlowe, 1963); cooperative, 
individualistic, or competitive motivational 
orientations (Deutsch, 1962); and interna- 
tionalism versus isolationism (Lutzker, 1960; 
McClintock, Gallo, & Harrison, 1965). De- 
spite these relationships, there is strong evi- 
dence to indicate that, normatively, subjects 
choose to compete or to exploit one another 
(McClintock et al., 1965; Minas, Scodel, 
Marlowe, & Rawson, 1960; Scodel, Minas, 
Ratoosh, & Lipetz, 1959). The predominance 
of competitive bargaining in these games 
raises a question about their generality as 
models of such interpersonal situations as 
economic bargaining. Subjects appear to bar- 
gain according to a maximization of difference 
principle by which the primary utility is to 
outdo the other player (or to prevent him 
from doing better than oneself). Competitive- 
ness and maximization of difference are, of 
course, the social norm of games; their preva- 
lence in these two-person bargaining situa- 
tions might be taken to suggest that subjects 
treat them more as games. If true, this 
would seriously weaken the value of these 
games as miniature models of complex bar- 
gaining situations. If, on the other hand, 
strategy choices in bargaining games are 
related to characteristic motives and orienta- 
tions of the players, especially motives and 
orientations pertinent to economic and inter- 
personal bargaining, there is suggestive evi- 
dence of the generality of such games. 

The primary purpose of this experiment 
was to investigate the correlates of two- 
person, non-zero-sum game behavior in the 
economic motivation of the subject. It 
sought to determine whether an individual- 
istic, risk-taking, entrepreneurial orientation 
is associated with a competitive game strat- 
egy, and conversely whether subjects with a 
more bureaucratic orientation would more 
frequently seek bargaining agreements. 

The measure of economic orientation was
not taken on the subject himself, however,
but on the subject's family. Subjects were
classified as coming from entrepreneurial or
bureaucratic families using the criteria of
Miller and Swanson (1958). These investigators
(1958, 1960) found differences in
child-rearing practices between entrepreneurial
and bureaucratic parents and in the expressive
styles of children from these types
of families. To the degree that an entrepreneurial
or bureaucratic value system pervades
the attitudes and child-rearing practices of
parents, it clearly should be reflected in the
value systems—and bargaining strategies—of
their children. 


TABLE 1
NON-ZERO-SUM GAME PAYOFF MATRIX

                    Player 2
Player 1        Black          Red
Black         $.03, $.03     0, $.04
Red           $.04, 0        $.01, $.01

One difference of potential importance be- 
tween the children of entrepreneurial and 
bureaucratic families appears to lie in the 
goals set for them by their parents, especially 
goals relating to behavior control and competitive
achievement. As Miller and Swanson
(1960) observe, “sons of entrepreneurs in the
middle class are subjected to the most intense
parental pressures [p. 390].” To investigate 
the effects of patterns of goal setting on bar- 
gaining and the interaction of goal-setting 
patterns and economic orientation, a measure 
of level of aspiration was included in the 
experiment. 


METHOD 


A total of 76 University of Connecticut introduc- 
tory psychology students, 28 males and 48 females, 
participated in pairs. In counterbalanced order, they 
were individually administered the Rotter (1942) 
Level of Aspiration Board and played together in 
a two-person, non-zero-sum game of “prisoner’s 
dilemma” form. Males were paired with males and 
females with females in playing the game. 

In the game, subjects were seated on opposite 
sides of a partition, and the procedure was explained
to them. In front of each subject was a
small panel with two buttons, one black and one
red, and the game matrix was posted in front of
him on the partition. As shown in Table 1, the
game provided for a joint payoff of $.03 if each
player made the black (cooperative) choice, $.01 if 
each pressed red, and $.04 to the red player and 
nothing to the player choosing black in the case of 
mixed choices. Choices were made by each player 
simultaneously on a signal from the experimenter. 
The subjects were told that the game would con- 
tinue for 20 trials, that at no time could they 
communicate with each other, and that they might 
keep their payoffs from each trial. In describing the 
game to the subjects the use of words like “game,” 
“play,” “win,” etc., was carefully avoided to minimize 
their impression of the task as a competitive game. 
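
The payoff rules described above can be written out as a small lookup table. This is a minimal sketch in Python, assuming payoffs are expressed in cents and ordered as (Player 1, Player 2). The names `PAYOFFS`, `is_prisoners_dilemma`, and `total_winnings` are illustrative only; the check uses the standard ordering for a prisoner's dilemma (temptation > reward > punishment > sucker's payoff), which the text does not spell out.

```python
# Payoffs in cents as (player 1, player 2), following the Method text:
# 3 cents each for black-black, 1 cent each for red-red, and on mixed
# choices 4 cents to the red player and nothing to the black player.
PAYOFFS = {
    ("black", "black"): (3, 3),
    ("black", "red"): (0, 4),
    ("red", "black"): (4, 0),
    ("red", "red"): (1, 1),
}


def is_prisoners_dilemma(payoffs):
    """Check the usual ordering from player 1's point of view."""
    reward = payoffs[("black", "black")][0]
    punishment = payoffs[("red", "red")][0]
    temptation = payoffs[("red", "black")][0]
    sucker = payoffs[("black", "red")][0]
    return temptation > reward > punishment > sucker


def total_winnings(plays, payoffs=PAYOFFS):
    """Sum each player's payoff over a sequence of (choice1, choice2) trials."""
    first = sum(payoffs[play][0] for play in plays)
    second = sum(payoffs[play][1] for play in plays)
    return first, second


if __name__ == "__main__":
    print(is_prisoners_dilemma(PAYOFFS))
    print(total_winnings([("black", "black")] * 20))
    print(total_winnings([("red", "red")] * 20))
```

Over the 20 trials, a pair that always chooses black earns three times what a pair that always chooses red earns, which is why joint red plays are treated as bargaining failures.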

The Level of Aspiration (LOA) Board is a goal- 
setting technique in which the subject’s task is to 
hit a little steel ball down a grooved board at the 
end of which is a series of numbers from 1 to 10 
and back down to 1. Before each trial, he announces 
his expected score. From the cumulated difference 
between previous success or failure and subsequent 
estimate (D score) and the number and type of
shifts in estimates, a series of patterns describing
the individual's overall approach to the setting of
goals can be determined. LOA patterns were assigned
according to Rotter's (1954) criteria and divided
a priori into three groups: Patterns 1 and 3, representing
achievement-oriented, essentially realistic adjustments
to success and failure (e.g., stability; the
absence of inappropriate shifts in estimates—down
after success or up after failure; and striving for
goals somewhat above previous achievement); Patterns
2, 4, and 7, the more overcautious and failure-avoidance
patterns; and Patterns 5, 6, 8, and 9,
styles of goal setting involving avoidance of self-evaluation,
wishfulness, and a tendency to leave the
reality of the situation (e.g., inappropriate shifts and
unrealistically high estimates). 


The entrepreneurial or bureaucratic orientation of 
each subject's family was determined according to 
Miller and Swanson's (1958) criteria from a ques- 
tionnaire filled out by each subject. Entrepreneurial 
parents are those engaged in risk-taking occupations 
(farmer, small businessman, lawyer, or doctor, etc.) 
or those with a history of socioeconomically competitive,
“individuating” experiences (e.g., being born
and raised on a farm). The primary criterion of
bureaucracy is employment in a relatively large
organization of complex structure. 


RESULTS 


Tables 2 and 3 show the major findings. 
By an unweighted means analysis for disproportional
cell frequencies (Table 2), the
effect of family orientation approximates
significance (p = .06); subjects from entrepreneurial
families bargained more exploitatively
than subjects with bureaucratic parents.
The interaction of family orientation
and LOA is significant (p < .02). As the
mean number of black (cooperative)
plays of each subgroup in Table 3 shows, the type
of LOA pattern makes no difference among
bureaucratic subjects. For entrepreneurs,
however, it does: those with Patterns 5, 6, 8,
or 9 played significantly more cooperatively 


TABLE 2
ANALYSIS OF VARIANCE OF GAME BEHAVIOR

Source                                 df      SS       MS      F
Entrepreneurs versus bureaucrats (A)    1     76.39    76.39   3.79*
LOA patterns (B)                        2     90.22    45.11   2.24
A X B                                   2    167.39    83.69   4.15**
Error                                  70   1411.99    20.17
Total                                  75

* p = .06.
** p < .02.
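
The entries of Table 2 can be checked for internal consistency, since each mean square should equal its sum of squares divided by its degrees of freedom. The sketch below is in Python and rests on one stated assumption: the sum of squares for LOA patterns is taken as 90.22, the value implied by the printed mean square of 45.11 on 2 degrees of freedom. The names `TABLE2`, `ms_matches`, and `f_against_error` are illustrative, not the author's.

```python
# Rows as (df, SS, printed MS).
# Assumption: SS for LOA patterns (B) is 90.22, the value implied by
# the printed mean square of 45.11 on 2 degrees of freedom.
TABLE2 = {
    "A": (1, 76.39, 76.39),
    "B": (2, 90.22, 45.11),
    "AxB": (2, 167.39, 83.69),
    "Error": (70, 1411.99, 20.17),
}


def ms_matches(df, ss, ms, tol=0.01):
    """True when SS / df agrees with the printed mean square within `tol`."""
    return abs(ss / df - ms) <= tol


def f_against_error(name, table=TABLE2):
    """F ratio for one source, using the Error row as the denominator."""
    df, ss, _ = table[name]
    err_df, err_ss, _ = table["Error"]
    return (ss / df) / (err_ss / err_df)


if __name__ == "__main__":
    for name in ("A", "B", "AxB"):
        print(name, round(f_against_error(name), 2))
```

With these inputs the three F ratios round to 3.79, 2.24, and 4.15, and the degrees of freedom sum to the printed total of 75.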
TABLE 3

MEAN NUMBER OF BLACK (COOPERATIVE) PLAYS OF
THE VARIOUS GROUPS

                       LOA pattern groups
Family          1 and 3      2, 4, and 7    5, 6, 8, and 9
                N     M       N     M        N      M
Entrepreneur   21    6.05    12    4.42     11    10.91
Bureaucrat     15   11.67    14    8.64      3     8.33

Note.—Groups containing a subscript in common do not
differ significantly. 


(t = 2.66, p < .02) than those with Patterns 
1 and 3 or those with Patterns 2, 4, and 7 
(t = 3.40, p < .01). The LOA Pattern 1 and 
3, and 2, 4, and 7 groups do not differ 
significantly. 

As a check on the reliability of the major 
finding, those pairs in which entrepreneurs 
bargained with entrepreneurs and bureaucrats 
bargained with bureaucrats were contrasted in 
the joint number of black plays. Here, the 
N was the number of pairs. The difference 
between entrepreneurs and bureaucrats is in 
the predicted direction and highly significant 
(t = 2.91, p < .01). 

If the black plays of individual entre- 
preneurs and bureaucrats are examined in 
symmetrical (entrepreneur-entrepreneur or 
bureaucrat-bureaucrat) versus mixed (entre- 
preneur-bureaucrat) pairs, the difference be- 
tween the mean number of black plays of 
paired entrepreneurs and paired bureaucrats 
is highly significant (Ms, respectively, of 5.97 
and 10.77; t = 3.97, p < .01), while in mixed 
pairs the difference is in the expected direction
(M_E = 8.64, M_B = 9.07) but nonsignificant. 

The conclusion of an agreement is repre- 
sented by a play in which both participants 
choose black, and the clearest case of a 
bargaining failure is the red-red play. 
Bureaucrat-bureaucrat pairs had the highest 
mean number of joint black plays and 
entrepreneur-entrepreneur pairs the least, but 
the probability of the difference is .11. Mixed 
pairs were intermediate in joint black plays. 
For bargaining failures, entrepreneurial pairs 
had significantly more red-red plays than 
bureaucratic pairs (t = 3.39, p < .01); the 
difference between bureaucratic and mixed 
pairs approximates significance (t = 2.01,
p < .10), with the bureaucratic pairs having 
the lower number of red-red plays. The dif- 
ference between entrepreneurial and mixed 
pairs is in the expected direction but not sig- 
nificant. 

Rather than black or red plays, if the 
amount won is taken as an index, the results 
are essentially similar to those presented 
above: paired entrepreneurs win the least and 
paired bureaucrats the most money; entre- 
preneurs playing bureaucrats are intermediate 
and not significantly different from each 
other. These parallel findings, of course, nec- 
essarily follow from the outcome provisions 
of the prisoner’s dilemma game. 

Finally, entrepreneurial males played sig- 
nificantly more cooperatively than entrepre- 
neurial females (t= 2.39, p < .05); among 
bureaucrats this sex difference did not appear. 


Discussion 


In confirming the major hypothesis, the 
findings of this experiment suggest that the 
strategies chosen by subjects in two-person 
conflict of interest games may well possess 
generality. Subjects classified on the basis 
of the economic orientation of their parents 
pursue utilities and employ game strategies 
consistent with and evidently derived from 
the individualistic and competitive or organi- 
zationally directed and cooperative character- 
istics of their parents. It is likewise clear that 
the major finding contributes to the validity 
of the Miller-Swanson distinction. Appar- 
ently, the risk-taking or bureaucratic experi- 
ences and attitudes of parents are carried 
over in child-rearing practices to be perpetu- 
ated in the competitive or cooperative behav- 
ior of their offspring. The entrepreneurial- 
bureaucratic findings are not an artifact of 
social class differences. Socioeconomic status 
was estimated from parental occupation, and 
when the entrepreneurial and bureaucratic 
groups were compared no difference in SES 
was found. Neither was a relation with 
intelligence obtained. 

Bargaining agreements in conflict of interest
situations, as Deutsch and Krauss (1960)
point out, have a greater chance to occur if
the cooperative goals of the participants prevail
over competitiveness. When the entrepreneur
bargains, his objective is clearly competitive—to
outdo the other bargainer or to
prevent him from doing better than himself—although
the outcome is to the monetary
detriment of both. If the entrepreneurs in this
experiment could not really be portrayed
in Veblen's (1899) sardonic terms as unscrupulous,
it is also true that they showed
something of a disregard for the cooperative
overtures and trustfulness necessary to lubricate
the path of agreement. This was especially
true when entrepreneurs engaged
entrepreneurs. Here, competitive interests
dramatically predominated. When entrepreneurs
played bureaucrats, there was a clear
trend for bargaining failures to occur more
frequently than among bureaucrat-bureaucrat
pairs. 

The effect of entrepreneurial or bureaucratic
orientation, however, is not simply due
to strategy preferences independent of the
bargaining behavior of the other player.
Competitive bargaining was most clearly observed
when entrepreneurs were paired, while
cooperative strategies emerged most strongly
in bureaucratic pairs. In effect, driving a hard
bargain or making bargaining overtures tend
to be reciprocated. In mixed pairs, entrepreneurs
were affected by the cooperative overtures
of their bureaucratic opponents, tending
to respond less competitively. A similar trend,
in the direction of less cooperation, was observed
among the bureaucrats in mixed pairs.
Although the differences between paired and
unpaired entrepreneurs and bureaucrats are
not significant, they serve to point up a fact
about behavior in the prisoner’s dilemma
game—that strategy choices are in part interdependent. 

Competitiveness in bargaining is not necessarily
to be viewed as aggressive. In fact,
Miller and Swanson (1960) report that the
children of entrepreneurs are subject to strong
parental training in the inhibition of aggression,
and they also found a trend among
children of entrepreneurial parents to develop
inhibitory and “self-modifying” defenses.
How does the entrepreneur, then, escape the
aggressive implications of exploitativeness?
The Protestant ethic, of course, exempts economic
competition from the domain of aggression,
making it not only permissible but
laudable, and it may be in this value system
that the entrepreneur finds sanction. The cooperative
behavior of the bureaucrat, if
Miller and Swanson are correct, seems to be
adopted from parental emphasis on fitting in,
on being accepted, and on playing one role
among many interlocking roles in complex
social organizations. Cooperativeness in these
terms seeks social acceptance and security. 

Entrepreneurial subjects with more disturbed,
“irreal” LOA patterns deviated from
the exploitative bargaining strategy of other
entrepreneurs. These individuals may represent
entrepreneurial “casualties” in the sense
that they seek to avoid competition. In distinction
to the other groups of LOA patterns,
the 5, 6, 8, and 9 group is characterized by
inability to deal realistically with competition.
Behaviorally, these goal-setting styles involve
the avoidance of evaluation by wishfulness.
The bargaining situation allows a kind of
“avoidant cooperation” in which cooperation
occurs not for cooperative goals but as an
avoidance of the painful consequences expected
from competitive behavior. Competitiveness,
of course, is more salient for entrepreneurs
than for bureaucrats, and entrepreneurs
with these patterns should show the
effects more clearly. A fruitful hypothesis is
that the maladjusted LOA entrepreneurs are
subjected to more extreme parental demands
and higher goals as children. In this connection,
it is interesting to note the trend
toward the more frequent occurrence of
maladjusted LOA patterns among the
entrepreneurs.³ 

There is, finally, no obvious explanation for 
the greater competitiveness of entrepreneurial 
females over entrepreneurial males. Perhaps 
the daughters of entrepreneurs are given more 
rigorous competitive training in childhood, or 
alternatively competitiveness "takes" better 
for females in child rearing because of their 
greater compliance. 

3 Collapsing the LOA Pattern 1 and 3, and 2, 4, 
and 7 groups and casting the resulting frequencies 
in a fourfold table, χ² = 3.01 with a probability less 
than .10. 
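
The chi-square value in Footnote 3 can be reproduced from the group sizes given in Table 3. The following Python sketch assumes the usual uncorrected formula for a fourfold (2 x 2) table and takes the cell frequencies from the Ns in Table 3 (entrepreneurs: 21 + 12 versus 11; bureaucrats: 15 + 14 versus 3). The function name `chi_square_2x2` is illustrative, and whether the original analysis applied a continuity correction is not stated.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected chi-square for the fourfold table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


if __name__ == "__main__":
    # Rows: entrepreneur, bureaucrat.
    # Columns: Patterns 1, 3, 2, 4, 7 combined versus Patterns 5, 6, 8, 9.
    print(round(chi_square_2x2(21 + 12, 11, 15 + 14, 3), 2))  # 3.01
```

The uncorrected statistic matches the reported 3.01, which suggests, though it does not prove, that no continuity correction was used.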

2 experiments are reported that examine the following hypotheses about cognitive
consistency: After a person has responded to a series of opinion items
that includes propositions drawn from scrambled syllogisms, his subsequent
responses to the same items become more logically consistent—a “Socratic”
effect; when his beliefs are changed through persuasion, logically derivable but
unmentioned beliefs are also changed so as to maintain consistency. Support
for these propositions was obtained by McGuire purposely using Ss of low
intellectual achievement. Our experiments, both based on Ss of considerably
higher academic accomplishment, do not support the Socratic effect, but
replicate in most essentials the findings on indirect effects of persuasion.
Exp. II further demonstrates that the indirect effects are not dependent on
experimental salience of the issues. The absence of a Socratic effect may be
due to greater initial consistency among beliefs in our Ss. 


During the past 2 decades, considerable 
research has explored implications of the gen- 
eral point of view that inconsistency among 
a person's beliefs motivates changes in aspects
of his belief structure such as to increase
consistency (e.g., Festinger, 1957; 
Heider, 1958; Rosenberg & Abelson, 1960). 
McGuire (1960a, 1960b) has recently sug- 
gested that one process through which con- 
sistency among cognitions is achieved depends 
on the elicitation of beliefs in close temporal 
contiguity. Thus if a person is asked to state 
his several views on a given issue on Occasion 
1, these views, when elicited on Occasion 2, 
will be more consistent than they were for- 
merly. McGuire has labeled this process the 
“Socratic method” of persuasion. 

The model developed by McGuire (1960a, 
1960b) for assessing consistency among be- 
liefs combines probability theory and formal 
logic. With a set of beliefs that are logically 
related as components of a syllogism, full 
logical consistency by this model requires that 
the probability that the conclusion is true 
be the product of the probabilities attached 
to the premises. Given independent probabil- 
ity ratings of conclusions and premises the 
logical consistency of the set of beliefs can 
be determined. 
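
The product rule described above can be illustrated with a short calculation. This is a minimal sketch in Python, not McGuire's own procedure: it assumes probabilities are expressed on a 0 to 1 scale, and the function names and the example ratings (.80, .50, .55) are hypothetical.

```python
def expected_conclusion(major, minor):
    """Conclusion probability implied by the product of the two premises."""
    return major * minor


def excess(conclusion, major, minor):
    """Rated conclusion probability minus the product of the premises.

    Zero means the three ratings are fully consistent under the model;
    a positive value means the conclusion is rated as more probable
    than the premises warrant.
    """
    return conclusion - expected_conclusion(major, minor)


if __name__ == "__main__":
    # Hypothetical ratings for one syllogism.
    major, minor, conclusion = 0.80, 0.50, 0.55
    print(round(expected_conclusion(major, minor), 2))
    print(round(excess(conclusion, major, minor), 2))
```

In this hypothetical case the premises imply a conclusion probability of .40, so a rating of .55 exceeds the consistent value by .15.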

Using this model, McGuire tested the hypothesis
of a “Socratic effect,” and also of
indirect persuasive effects mediated by trends
toward logical consistency. That is, if persuasive
communications can be successfully
directed toward certain beliefs (the minor
premises) in the syllogistic sets, changes
should also occur on logically related but unmentioned
beliefs (the derived conclusions)
in such a way as to maintain consistency. 
Two additional propositions tested were based 
on the assumption of “cognitive inertia”: 
Such indirectly induced changes on logically 
related but unmentioned beliefs are smaller 
in magnitude than would be logically required 
by the amount of change on the explicit 
issues; and change on the unmentioned issue 
occurs gradually over a period of time. His 
data were interpreted as supporting all of 
these predictions. 

Because his model is based on a syllogistic 
relationship among beliefs, McGuire deliber- 
ately used high school seniors and college 
freshmen of demonstrated low academic ability
so that any apparent tendency toward
consistency could not be attributed to a conscious
attempt on the part of the subjects
to be logical. Actually McGuire was unnecessarily
cautious. Even if his subjects knew
that a model based on formal logic was being
applied in the study and that logical consistency
among beliefs was the focal issue,
they could not have foreseen the multiplicative
features of the model. In any case, 
the question remains as to the generality of 
McGuire's findings. Would a more intelligent 
and academically accomplished group of sub- 
jects exhibit similar tendencies toward distor- 
tions due to wishful thinking, similar Socratic 
effects, and similar logical repercussions fol- 
lowing exposure to persuasive communica- 
tions? We report here two studies that 
examine these questions. 

A second issue raised by McGuire’s studies 
is the role of salience in determining the im- 
pact of persuasive communications upon 
logically related but unmentioned issues. In 
the before-after design used by McGuire the 
before measure made the topics salient 1 week 
before persuasive communications were administered.
Perhaps logical repercussions
occur only when the topic has been made
salient in some such way. Also, according to
McGuire’s reasoning, making an issue salient
itself initiates Socratic effects. These may
augment, oppose, or otherwise influence the 
impact of the message on the logically related 
but unmentioned conclusion. Our second 
study, therefore, uses an after-only design, 
which precludes these possible consequences 
of experimental salience. 


EXPERIMENT I¹ 


This experiment attempts to extend all 
essential features of McGuire’s (1960b) 
study to a group of subjects much more intel- 
ligent and academically accomplished than 
the subjects used by McGuire. There is the 
further attempt to determine the influence of
differences in sophistication, within the restricted
range among our subjects, upon
Socratic and message effects. If the Socratic
effect found by McGuire is real and commonplace,
more intelligent and sophisticated subjects
should be more logically consistent in
their beliefs, and also, because their beliefs
are embedded in more articulated cognitive
structures, show less absolute change than
McGuire’s subjects, retaining more logical
consistency among their beliefs following persuasive
communications. These expectations
are based on the assumption that students
in the setting of a selective university, as
compared with McGuire’s low-performing
subjects, are more frequently confronted with
their beliefs, with ample stimulation to verbalize
them to self and others and thus attain
greater logical consistency. 


1 The authors are indebted to Mary Ann Squire 
who collected the data for this experiment. 


METHOD 


Eighty-seven University of California students, 35 
males and 52 females, rated the probabilistic truth 
value and desirability of 48 propositions that form 
the 16 syllogisms used by McGuire (1960b).² The 
order of the propositions on the questionnaire was 
randomly determined with the restriction that three 
unrelated propositions must precede or follow any 
given proposition. The probability ratings were 
made on a 100-point scale and the desirability 
ratings on a 5-point scale. Approximately 1 week 
later the same subjects received written persuasive 
communications directed at 8 of the 16 minor 
premises. They were told that the experiment was 
a study of the effect of controversial material on 
reading comprehension. Half of the subjects received 
communications directed at one set of 4 minor prem- 
ises and half at another set of 4. Communications 
were directed at only 4 of the syllogisms so that 
subjects would have ample time to finish their tasks 
within the allotted hour and so that data could be 
obtained from each subject on some no-message 
syllogisms. The two groups of 4 syllogisms were 
selected randomly. After reading the communica- 
tions subjects once again rated the truth of all 48 
propositions. The materials and procedure thus fol- 
low McGuire except for two respects: only 8 of 
his 16 syllogisms were used in the message conditions 
of this experiment, and a delayed-after measure was 
not included. 

To permit comparison of different levels of formal 
education, subjects were obtained from either lower 
division courses in psychology (sophomore credit 
courses in introductory or personal adjustment) or 
from an upper division course in social psychology. 
Seventy percent of the subjects in the lower division 
courses were freshmen and sophomores; 98% of the 
subjects in the upper division course had at least 
junior class standing. 


Results 


Socratic effects. The before-after changes in 
the no-message conditions permit testing for 


2 Several of the syllogisms and accompanying com- 
munications required slight modification to adjust 
to the differing settings of the experiments: For 
example, we could not argue that Berkeley was in 
the geographic center of the continental United 
States, nor that Berkeley’s minor league baseball 
team (which it does not have) would soon profit 
from a program to televise major league baseball to 
the city. In each case a minor adjustment only was 
required; these adjustments should have no system- 
atic impact on the outcome of the experiment. 
TABLE 1 
MEAN PROBABILITY SCORES (× 100) IN THE NO-COMMUNICATION CONDITIONS OF EXPERIMENT I 

                                   Major      Minor      Product of                 Excess of conclusions 
                                   premises   premises   premises     Conclusions   over product 

Syllogisms with conclusions less 
desirable than premises: 
  Lower division 
    First session                  56.78      52.07      29.16        22.07         -7.09 
    Change: First to second          .02       -.91       -.17         6.07          6.24 
  Upper division 
    First session                  50.14      46.14      22.55        22.76           .21 
    Change: First to second         -.2       -1.45       -.15         3.09          3.24 

Syllogisms with conclusions more 
desirable than premises: 
  Lower division 
    First session                  55.49      50.06      34.50        42.16          7.66 
    Change: First to second         1.22       8.77        .55         2.10          1.55 
  Upper division 
    First session                  52.84      57.29      32.58        40.36          7.78 
    Change: First to second         1.94       -.07        .26         1.78          1.52 


the Socratic effect. Following McGuire’s pro- 
cedure, the syllogisms were divided into two 
sets of four each according to whether the 
conclusions were relatively more or less de- 
sirable than the premises. The syllogisms were 
classified separately for the upper and lower 
division subjects, producing slightly different 
sets of syllogisms high and low in desirability 
for each group. Table 1 presents the basic 
data relating to the Socratic effect. 

According to McGuire’s postulate of wish- 
ful thinking the syllogisms in which the con- 
clusions were relatively more desirable than 
the premises should show a greater excess in 
the probability of the conclusions over the 
premises than the syllogisms in which con- 
clusions were less desirable than the premises. 
For the lower division subjects the excess is 
-7.09 for the conclusions low in desirability 
and 7.66 for those high in desirability. A t 
test of the mean difference results in a significant 
value of 2.40 (p < .05). (Unless otherwise 
stated, tests of significance are two-tailed.) 
For the upper division group the 
corresponding excesses are .21 and 7.78. A 
t test of the difference between these means 
results in a nonsignificant value of 1.21. 
Unlike our lower division subjects and 
McGuire's still less sophisticated ones, then, 
our upper division subjects show no evidence 
of wishful distortion by this criterion. 

To pursue the topic of wishful distortion 
further, we correlated the excess of the desirability 
of the conclusions over the premises 
with the excess of the rated truth values of 
the conclusions over the premises. The rank-order 
correlation for the lower division subjects 
was .58, and for the upper division 
subjects, .41. The comparable figure for 
McGuire's data (as computed by us)3 is .53. 
Apparently, the pattern of distortion in the 
present experiment, though not the absolute 
amount, is comparable for both the upper 
division and lower division subjects, and also 
comparable to that found by McGuire. 
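
The rank-order correlations reported above can be reproduced from per-syllogism scores. The following is a minimal sketch, assuming a Spearman-type coefficient (the text says only "rank-order") computed as the Pearson correlation of average ranks. The function names and the sample scores are hypothetical and are not the study's data.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_order_correlation(x, y):
    """Pearson correlation of the ranks of x and y (Spearman-type rho)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-syllogism values: excess desirability of the conclusion
# over the premises, and excess rated truth of the conclusion over the premises.
desirability_excess = [1.2, -0.4, 0.3, 2.0, -1.1]
probability_excess = [8.0, -3.0, 5.0, 4.0, -6.0]
print(rank_order_correlation(desirability_excess, probability_excess))
```

A coefficient near 1 would mean that the syllogisms with the most desirable conclusions are also the ones whose conclusions are rated most probable relative to their premises, which is the pattern wishful thinking predicts.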
According to McGuire, a shift in the 
direction of mutual consistency, the Socratic 
effect, is indexed by an increase in the excess 
probability of the conclusions over the 
product of the premises for the syllogisms 
that have relatively undesirable conclusions 
and a decrease for the syllogisms that have 
relatively desirable conclusions. As is evident 
from Table 1, the excess probability of the 
conclusions over the product of the premises 
increases for both sets of syllogisms in both 
classes. In both groups, however, the shift 
in the excess is larger for the low-desirability 
conclusions than for the high-desirability 
conclusions (6.24 as opposed to 1.55, and 3.24 
as opposed to 1.52). In neither case, though, 
are the differences significant (t = 1.74 for 

3 We would like to thank W. J. McGuire for 
making these data and other materials available 
to us. 
lower division subjects and .53 for upper 
division subjects). 
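
The index used in this paragraph can be checked directly against Table 1. The sketch below uses the upper division rows of that table; the function names are hypothetical, and the test for the Socratic pattern is our reading of McGuire's criterion as stated above (an increase in the excess for the less desirable conclusions together with a decrease for the more desirable ones).

```python
# Upper division rows of Table 1 (scores x 100): first-session means and
# first-to-second changes for the product of the premises and the conclusions.
UPPER_DIVISION = {
    "less_desirable": {"product": 22.55, "conclusion": 22.76,
                       "product_change": -0.15, "conclusion_change": 3.09},
    "more_desirable": {"product": 32.58, "conclusion": 40.36,
                       "product_change": 0.26, "conclusion_change": 1.78},
}


def excess(conclusion, product):
    """Excess probability of the conclusion over the product of the premises."""
    return conclusion - product


def excess_change(row):
    """Change in the excess from the first to the second session."""
    return row["conclusion_change"] - row["product_change"]


def shows_socratic_pattern(less_desirable_shift, more_desirable_shift):
    """True if the excess rises for less desirable conclusions and falls
    for more desirable ones."""
    return less_desirable_shift > 0 and more_desirable_shift < 0


low = UPPER_DIVISION["less_desirable"]
high = UPPER_DIVISION["more_desirable"]
low_shift = round(excess_change(low), 2)
high_shift = round(excess_change(high), 2)
print(round(excess(low["conclusion"], low["product"]), 2), low_shift)
print(round(excess(high["conclusion"], high["product"]), 2), high_shift)
print(shows_socratic_pattern(low_shift, high_shift))
```

Both shifts are positive for this group, so the strict pattern is not met; only the relative size of the two shifts is in the predicted direction, as the text notes.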

An alternative procedure for assessing the 
Socratic effect that does not involve any assumption 
about distortion due to desirability 
is to examine the correlation between the 
probability ratings of the product of the 
premises and the conclusions in the first and 
in the second administrations of the questionnaire. 
For the lower division subjects the 
initial rank-order correlation is .75 and the 
final correlation .75. The corresponding correlations 
for the upper division subjects are 
.73 and .90. There is thus some movement 
toward greater correspondence between the 
joint of the premises and the conclusions for 
the upper division subjects. 


TABLE 2 

MEAN BEFORE-AFTER CHANGES IN MINOR PREMISES 
AND CONCLUSIONS FOR BOTH COMMUNICATION 
AND NO-COMMUNICATION CONDITIONS 
AND BOTH LEVELS OF EDUCATION 

                    Lower division            Upper division 
                  No message   Message      No message   Message 
Minor premises      -1.61       20.51         -0.54       17.86 
Conclusions          2.91       12.98          1.25        8.35 


Message effects. Analyses of the message 
effects are based on 2 × 2 × 2 analyses of 
variance (Message × Level of Education × 
Syllogism Sets) taking minor premises, conclusions, 
and predicted-versus-obtained change 
as dependent variables. The mean changes for 
the message conditions are presented in 
Table 2 and the analyses of variance are 
presented in Table 3. 

Minor premises. The F for messages, 34.25, 
is highly significant, indicating that change 
on the minor premises was produced by the 
persuasive communications. The only other 
significant F is for sets of minor premises. 
This F indicates that one set changed more 
in both the message and no-message condi- 
tions than did the other set. None of the 
message interactions is significant. 

Conclusions. The mean changes in the conclusions 
of those syllogisms at whose minor 
premises persuasive communications 
were directed are presented in Table 2 and an 
TABLE 3 

ANALYSIS OF VARIANCE OF THE MESSAGE-RELATED 
BEFORE-AFTER CHANGES IN MINOR 
PREMISES AND CONCLUSIONS 

                         Minor premises         Conclusions 
Source            df       MS        F           MS        F 
Sets (A)           1     245.38    11.28*       49.76     1.52 
Error              6      21.74                 32.65 
Education (B)      1       4.96                 79.07     7.77* 
Messages (C)       1    3284.55    34.25**     589.11    16.59** 
C × B              1      27.76                 17.55 
A × B              1       2.54                 10.75     1.06 
C × A              1      23.47                 48.25 
C × A × B          1       8.18                  6.49 
Error (1)          6      27.82                 10.17 
Error (2)          6      95.90                 35.51 
Error (3)          6      55.35                 34.03 

* p < .05. 
** p < .01. 


analysis of variance in Table 3. The F of 
16.59 for the effect of the messages is significant 
(p < .01), indicating substantial change 
on the derived, unmentioned issues. The other 
significant F (p < .05) indicates that the 
lower division subjects changed more on the 
derived issues than did the upper division 
subjects. 

Actual change in the conclusions versus 
logically required change. On the basis of the 
assumption that cognitive systems are charac- 
terized by inertia, McGuire (1960a, p. 345) 
theorized that the obtained change in the 
conclusions would be significantly less than 
that logically required on the basis of the 
following formula: 


$$\Delta p(c) = \Delta p(a)\,p(b) + \Delta p(b)\,p(a) + \Delta p(a)\,\Delta p(b)$$ 


The mean predicted and obtained changes for 
the two classes are presented in Table 4, and 
an analysis of variance in Table 5. There was 
no significant difference between the predicted 
and obtained changes. In fact, the subjects in 


TABLE 4 

MEAN LOGICALLY REQUIRED AND ACTUAL CHANGES 
IN CONCLUSIONS FOR THE TWO LEVELS 
OF EDUCATION 

Change                Lower division    Upper division 
Actual                    12.98              8.35 
Logically required         9.80              8.75 
TABLE 5 

ANALYSIS OF VARIANCE OF LOGICALLY REQUIRED AND 
ACTUAL CHANGES IN CONCLUSION PROBABILITY 
FOR THE TWO LEVELS OF EDUCATION 

Source                          df       MS        F 
Education (A)                    1      56.98 
Required-actual change (B)       1      11.88 
A × B                            1      30.62 
Propositions                     1     127.70     1.83 
Error                           21      65.50 


the lower division group changed slightly 
more than required for logical consistency. 
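
McGuire's formula for the logically required change is simply the change in the product of the two premise probabilities. The sketch below illustrates it with probabilities on a 0-to-1 scale; the function name and the numerical values are hypothetical and are not taken from the data.

```python
def required_change(p_a, p_b, d_a, d_b):
    """Logically required change in p(c), given initial premise
    probabilities p_a and p_b and their changes d_a and d_b."""
    return d_a * p_b + d_b * p_a + d_a * d_b


# Hypothetical syllogism: one premise is untouched by the message,
# the other rises from .45 to .63.
p_a, p_b = 0.50, 0.45
d_a, d_b = 0.00, 0.18

predicted = required_change(p_a, p_b, d_a, d_b)

# The same quantity obtained directly as new product minus old product.
direct = (p_a + d_a) * (p_b + d_b) - p_a * p_b
print(predicted, direct)
```

Comparing the obtained change in the conclusion with this predicted value, syllogism by syllogism, gives the required-versus-actual contrast summarized in Tables 4 and 5.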


EXPERIMENT II 


Both the McGuire studies and the experi- 
ment just reported find significant logical 
repercussions among beliefs that are logically 
related to beliefs at which persuasive com- 
munications are explicitly directed. The cir- 
cumstances of the studies, however, do not 
preclude the possibility that the communica- 
tion effects depend on the salience of the 
beliefs that are explicit and implicit foci of 
the communications. If the change toward 
consistency depends on the salience of the 
beliefs involved, then changes in the probability 
of the conclusions should be small or 
nonexistent in the case of syllogisms whose 
premises and conclusions are not 
experimentally salient. The present experiment 
examines logical repercussions of persuasive 
communications in such a setting. It 
also serves as a further test of the hypothe- 
sis that merely eliciting logically related 
beliefs in close temporal contiguity leads to 
predictable changes. 


Method 


McGuire's 16 original syllogisms were used in an 
after-only, control-group design that permitted 
assessment of effects immediately following the 
communication as well as delayed effects 1 week later. 
Eighty-one subjects (39 males and 42 females) par- 
ticipated in two 1-hour sessions separated by 1 week. 
The subjects were from an introductory social 
psychology course and were fulfilling a course re- 
quirement; they are thus comparable to the upper 
division subjects of the previous study. Subjects 
were told that the experiment investigated the 
learning and forgetting of controversial materials. 
At the conclusion of the second session the nature 
of the experiment was explained in full and the 
experimenter attempted to answer all questions asked. 
In the first session each subject received communications 
on the minor premises from 4 of the 
16 syllogisms, the four groups of 4 syllogisms each 
having been randomly assembled. After each of the 
communications, a number of questions were asked 
about the article so as to support the guise of an 
experiment dealing with the learning of controversial 
material. Following these communications was a 
“Controversiality Questionnaire,” which contained, in 
scrambled order, the premises and conclusions from 
the 16 syllogisms. The truth of each statement was to 
be rated on a graphic scale ranging from 0 (highly 
improbable) to 100 (highly probable). Next came 
a section labeled “Desirability Questionnaire,” arranged 
in a similar fashion, on which each statement 
was to be marked as to its desirability. In the second 
session each subject again filled out the Controversiality 
Questionnaire, from which a delayed-after 
measure could be computed. 

Each subject provided experimental data on 4 syl- 
logisms and control data on the other 12. This 
design permitted a control for any Socratic effects 
in the delayed measure by using as control data 
those 12 syllogisms on which the subjects did not 
receive a communication. The analyses for message 
effects are based on t tests comparing the means 
of the differences between control and experimental 
groups on the respective syllogisms. The specific 
analyses are described in the presentation of results. 
All analyses are two-tailed unless otherwise indicated. 
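
The t tests described here treat the syllogism as the unit of analysis. The sketch below assumes a one-sample t test on the per-syllogism differences between experimental and control means, which is our reading of the procedure and is consistent with the df of 14 reported for 15 syllogisms. The function name and the sample differences are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_for_mean_difference(differences):
    """t statistic and degrees of freedom for testing whether the mean of
    per-syllogism differences departs from zero."""
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)  # stdev uses n - 1
    return mean(differences) / standard_error, n - 1


# Hypothetical experimental-minus-control differences for 15 syllogisms.
diffs = [12.0, 20.5, 8.0, 25.0, 17.5, 9.0, 30.0, 14.0,
         22.0, 11.5, 19.0, 6.0, 27.0, 16.0, 13.0]
t, df = t_for_mean_difference(diffs)
print(round(t, 2), df)
```

The resulting t is referred to a t distribution with n - 1 degrees of freedom, two-tailed unless a directional prediction is stated.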

It was necessary to eliminate one syllogism con- 
cerning internal events in Russia and a third world 
war from the analyses because in the middle of the 
experiment the U2 “spy plane” incident occurred. 
The coverage of this alarming event in the news 
media created significant effects on the rated truth 
values of the premises and conclusion (Dillehay, 
1964). An additional syllogism was randomly discarded 
for the no-message analyses so that equal 
Ns appeared in each cell. The analyses of message 
effects are based on 15 syllogisms. 


Results 


Socratic effects. Following McGuire, as in 
the previously reported experiment, syllogisms 
were divided into those with conclusions relatively 
more desirable than the premises and 
those with conclusions relatively less desirable 
than premises. The excess in probability of 
conclusions over premises for these two 
groups of syllogisms is given in Table 6 in 
the first and third rows of the last column. 
The difference, while in the predicted direction, 
yields a statistically insignificant F of 
.51. It thus appears that wishful thinking 
does not account for a significant distortion 
in these sets of beliefs for the subjects in this 
experiment. As a further test for wishful 
thinking, we correlated the excess of the de 
TABLE 6 
MEAN PROBABILITY SCORES (× 100) IN THE NO-COMMUNICATION CONDITIONS OF EXPERIMENT II 

                                   Major      Minor      Product of                 Excess of conclusions 
                                   premises   premises   premises     Conclusions   over product 

Syllogisms with conclusion less 
desirable than premises 
    First session                  53.46      45.70      23.99        25.37          1.38 
    Change: First to second        -1.03       1.63        .84         2.30          1.46 
Syllogisms with conclusion more 
desirable than premises 
    First session                  50.49      58.21      32.65        39.13          6.48 
    Change: First to second         1.57       2.25       1.62         1.63           .01 


sirability of the conclusions over the premises 
with the excess of the rated truth values of 
the conclusions over the premises. If there 
is distortion due to wishful thinking, the correlation 
should be high. The rank-order correlation 
for our data was .21; the comparable 
correlation for McGuire's data (as computed 
by us on the same 15 syllogisms used in the 
present study) is .61. There thus appears to 
be considerably more distortion due to wishful 
thinking among McGuire's subjects than 
among ours. Note, however, that the subjects 
of Experiment I, especially the lower division 
subjects, appear more similar to McGuire's 
in this comparison. 


Having found statistically insignificant distortion 
due to wishful thinking, one can ask, 
though not optimistically, if a Socratic effect 
will be found. The changes shown in Table 6, 
while in the expected direction, are not statistically 
significant. In addition, the change 
from the first to the second experimental session 
in the correlation between the probability 
ratings given to the conclusions and 
to the joint probability of the premises is 
slight, from .54 to .56. These correlations are 
considerably smaller than those obtained for 
either group in Experiment I. 


Message effects. Immediate changes in the 
minor premises. In the design used in this 
experiment a test of the impact of the mes- 
sages is a comparison of the ratings given 
the minor premises by those receiving communications 
on the issues and by those not 
receiving the communications. A t test of 
the control versus experimental ratings 
(Table 7) shows the mean difference of 16.71 
to be highly significant (t = 5.69, df = 14, 
p < .01). 

Immediate changes in the conclusions. The 
significant changes on the minor premises are 
paralleled by significant but smaller changes 
on the conclusions (Table 7). The mean dif- 
ference in the ratings given the conclusions 
by the experimental and control groups is 
3.44 (t = 2.19, df = 14, p < .05). 

Actual changes in the conclusions versus 
logically required changes. In the after-only 
design used in the present study the logically 
required change on the conclusions is pre- 
dicted from both control-group and experi- 
mental-group data. The differences between 
experimental and control groups are used to 
determine $\Delta p(a)$ and $\Delta p(b)$ in McGuire’s 
formula for predicted change (see above). 
By analogous reasoning, $p(a)$ and $p(b)$ in 


TABLE 7 

CHANGES IN MEAN PROBABILITY SCORES ON MINOR PREMISES AND ON CONCLUSIONS IMMEDIATELY 
AFTER AND 1 WEEK AFTER PERSUASIVE MESSAGES 

                                          Actual change on    Logically required    Actual change on    Actual change minus 
                                          explicit issue      change on             conclusions         required change 
                                          (minor premise)     conclusions 

Differences between control and 
experimental groups                            16.71               7.00                  3.44               -3.60 
Net changes for experimental group 
from immediately after to 1 week after         -7.13              -3.38                 -1.39                1.99 
the formula are based on control-group 
values. Table 7 shows that the mean logically 
required change in the conclusions in the 
immediate-after condition is 7.0; the actual 
mean change in the conclusions is 3.44. The 
difference between these means proves to be 
statistically significant (t = 1.77, df = 14, 
p < .05, one-tailed). The prediction, based on 
cognitive inertia, that there will be less 
change in the conclusion than required by 
change in the premises is supported by these 
data. 

This finding, which contradicts the results 
of Experiment I but agrees with McGuire's, 
invites comparison between the upper division 
subjects of Experiment I and those of this 
experiment with respect to differences be- 
tween logically required and obtained changes 
on the conclusions. An analysis of variance 
indicates that the difference between obtained 
and logically required change for the upper 
division subjects of Experiment I does 
not differ significantly from the comparable 
difference for Experiment II (F = .37). 

Changes and savings in the minor premises 
from immediately after to 1 week after. The 
mean net change on the minor premises is 
computed as the change in the message condi- 
tions over and above change from the first 
to the second session by the no-message 
group. Thus computed, this mean reflects a 
control for Socratic and other systematic ef- 
fects, however slight these may be. Table 7 
shows the mean net change to be -7.13, 
which is significant (t = 4.02, df = 14, 
p < .01). The minus sign indicates that the 
changes are in the direction of the control- 
group ratings given to these premises. 

While the reversion in belief toward the 
original position is significant, one can still 
ask what savings remain in the original 
change induced on the minor premises. This 
question can be answered by a comparison 
between control-group ratings and experi- 
mental-group ratings on the second occasion. 
For each of the 15 syllogisms the 
experimental-minus-control difference is positive, 
indicating that the impact of the communications 
did not entirely dissipate. The 
mean difference, 9.57, is highly significant 
(t = 4.47, df = 14, p < .01), showing 
substantial savings. 
Changes in the conclusions from immediately 
after to 1 week after. The mean net 
change in the conclusions, a figure which 
again reflects changes in the conclusions 
beyond those exhibited by the group that 
received no communications during this 
interval, is -1.39, a value that does not 
differ significantly from 0 (t = 1.21). The 
mean difference between the control and experimental 
groups in mean ratings given the 
conclusions on the second occasion is 2.01. 
This difference, in the expected direction but 
statistically unreliable (t = 1.53, df = 14, 
.05 < p < .10, one-tailed), reflects a tendency 
for the experimental group to rate the 
conclusions slightly more probable than did 
the control group. 

Actual change in the conclusions versus 
logically required change from immediately 
after to 1 week after. A comparison of the 
means in Table 7 indicates that the average 
change per syllogism from the first occasion 
to the second is 1.39 units back toward the 
preexperimental positions. One can ask how 
this change compares with the amount of 
change in the conclusions that is predicted 
from the change in the ratings given the 
major and minor premises. The mean required 
change is 1.99 units more than the obtained 
change, a nonsignificant difference (t = .99). 
Thus, the actual changes in the conclusions 
are in the appropriate direction when compared 
to changes in the minor premises, and 
they are not reliably different from the 
logically required change. 

Logical agreement between conclusions and 
joint of premises immediately after versus 
1 week after. Taking logical consistency to 
be the point of reference specified in the 
model, that is, $p(c) - p(a)\,p(b) = 0$, the 
prediction that changes toward consistency 
involving derived, unmentioned issues occur 
gradually over time can be tested by comparing 
$p(c) - p(a)\,p(b)$ obtained immediately 
after the communications with comparable 
data obtained 1 week later. This comparison, 
involving absolute differences from 
complete consistency, shows the mean departure 
per syllogism from logical consistency 
on the second occasion to be 3.27 points less 
than immediately following the communication. 
This difference is statistically significant 
(t = 2.69, df = 14, p < .02). Interpretation 
of this finding, however, is tempered by the 
decay that has occurred in the conclusions by 
1 week after the communication. 
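
The consistency measure used in this comparison is the absolute departure of the rated probability of the conclusion from the product of the premise probabilities. A minimal sketch follows, with probabilities on a 0-to-1 scale; the function names and the ratings are hypothetical and are not the study's data.

```python
def departure(p_a, p_b, p_c):
    """Absolute departure from logical consistency, |p(c) - p(a) p(b)|."""
    return abs(p_c - p_a * p_b)


def mean_departure(ratings):
    """Mean departure over a list of (p_a, p_b, p_c) triples, one per syllogism."""
    return sum(departure(*triple) for triple in ratings) / len(ratings)


# Hypothetical mean ratings for three syllogisms on each occasion.
immediately_after = [(0.50, 0.60, 0.40), (0.70, 0.50, 0.25), (0.40, 0.80, 0.45)]
one_week_after = [(0.50, 0.55, 0.33), (0.70, 0.45, 0.27), (0.40, 0.75, 0.38)]

reduction = mean_departure(immediately_after) - mean_departure(one_week_after)
print(round(reduction, 3))
```

A positive reduction means the ratings lie closer to logical consistency on the second occasion; in the analysis reported above, the per-syllogism reductions are then tested against zero with a t test.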


Discussion 
Wishful Thinking and the Socratic Effect 


We have found only scant evidence for 
distortion in logical consistency due to wish- 
ful thinking. This phenomenon was apparent 
for only the lower division subjects of Experi- 
ment I. We further found no significant 
Socratic effects in any of the three groups 
studied. The discrepancy between our data 
and McGuire’s on these points may be due 
to greater initial logical consistency for the 
subjects of the present report. Correlational 
data show positive but low-to-moderate rela- 
tionships between excess in the rated truth 
values of the conclusions over the joint of 
the premises, on the one hand, and the rela- 
tive desirability of the conclusions over the 
premises, on the other. These correlations, 
ranging considerably downward from the 
comparable coefficients for McGuire’s data, 
indicate less inconsistency attributable to 
wishful thinking among our subjects as com- 
pared to his. Our data further support the 
expectation that among our subjects the more 
educationally advanced show less distortion 
attributable to wishful thinking. These dif- 
ferences may be due to a greater concern for 
consistency on the part of more educationally 
advanced and intelligent subjects. 


Persuasion through Logical Repercussions 


Both our experiments replicate McGuire's 
findings that following changes on beliefs due 
to successful persuasion, significant changes 
in logically related but unmentioned issues 
do occur. The data from Experiment II fur- 
ther illustrate that salience of the proposi- 
tions making up the syllogisms cannot ac- 
count for the observed impact of the com- 
munication on the derived, unmentioned 
issues, since this experiment employed an 
after-only, control-group design. 

The prediction that change on the conclusions 
will be less than logically required 
following change on the minor premise is supported 
in Experiment II but not in Experiment I. 
In Experiment I, the lower division 
subjects in fact changed more than would 
be required for logical consistency (though 
the excess is not significant). We find no 
significant difference between lower and upper 
division subjects in amount of change in the 
minor premises and in the conclusions. Furthermore, 
the absolute amount of change in 
the minor premises is comparable for our 
subjects and McGuire's. But there is, if anything, 
a tendency for our subjects to change 
more in the conclusions than did his. Thus, 
even though less wishful thinking distortion 
is evident in our subjects, suggesting greater 
initial logical consistency, there is a tendency 
to change more on derived issues. 

Delayed changes, assessed in Experiment 
II only, generally replicated McGuire’s find- 
ings in that there was significant reversion 
to the original position on the explicit issue, 
significant savings on this issue, and non- 
significant reversion on the unmentioned 
conclusion. McGuire, however, found a slight 
tendency (p < .10) for the reversion on the 
conclusion to be less than predicted by the 
reversions on the premises. There is no such 
tendency in our data, even though the very 
small difference is in the appropriate direction. 
Further, at the time of the delayed 
measurement we did not find a significant 
difference between the message and no- 
message groups on the derived, unmentioned 
issues. Stated differently, 1 week later there 
was no remaining impact of the communica- 
tions on the conclusions. McGuire does not 
report on this point, which is germane to any 
consideration of delayed logical impact of 
persuasion, or a “sinking-in” process. Lack 
of delayed significant effects on the derived 
issues overshadows our finding in Experiment II 
of greater logical consistency 1 week 
following the communication than immedi- 
ately after. 

The replicable nature of the main finding 
that persuasive communications have impact 
upon unmentioned but derivable conclusions 
raises the question of the psychological 
mechanism at work here. As McGuire sug- 
gests, it is unlikely that a conscious effort 
toward logical consistency is operating: The 
properties of the model are far too complex 
to be understood and applied by even the 
most astute subjects. 

Perhaps changes occur in derived unmen- 
tioned issues less as a result of the logical 
structure of the relationships between explicit 
and implicit issues than of the extralogical 
cognitive relatedness of the propositions. If 
persuasion produces a change in acceptance 
of Fact A about Object X, this change may 
influence the acceptance of other facts about 
the object, especially facts that have an 
experiential relationship to Fact A or are 
“reasonably” related to A. What we are suggesting 
is that the observed changes in 
McGuire’s studies and ours may be due to 
a psychological process 
based on experience and/or judged reasonableness 
and not to valid logical structures 
among beliefs. A prediction from these suggestions 
is that when a belief about X is 
changed, nonlogical but cognitively similar 
changes should occur on related issues about 
X. The apparently logical repercussions found 


R. C. DILLEHAY, C. A. INSKO, AND M. B. SMITH 


in these studies may be due to the fact that 
we only looked at derived issues that were 
logically related: There may be roaches in 
the den as well as under the sink. 
INTERPERSONAL PROBING AND REVEALING AND 
SYSTEMS OF INTEGRATIVE COMPLEXITY 1 


BRUCE W. TUCKMAN 2 
Naval Medical Research Institute, Bethesda, Maryland 


1 group of Ss of the 4 Harvey, Hunt, and Schroder personality systems was 
tested with the Self-Disclosure Scale to determine the amount of personal 
information they revealed to others, while another group responded on the 
Probing Scale in terms of the amount of personal information they probed for. 
Ss responded for 2 targets—acquaintance and best friend; items were scaled into 
2 levels of intimacy—intimate and nonintimate. Results showed that System 
III Ss, the other-directed types, revealed more than did other system types, 
across targets and intimacy levels. System III Ss also probed acquaintance 
most, while System IV Ss, the information seekers, probed friend most. 
Overall, combined probing and revealing to friend exceeded that to acquaintance; 
combined nonintimate probes and disclosures exceeded intimate ones; and 
probing exceeded revealing in intimate areas, the reverse holding in nonintimate 
areas, yielding equal revealing and probing totals. 


Little is known about the kind of person 
who has a strong tendency to disclose personal 
information about himself to others, 
and the kind of person who has a strong 
tendency to probe others for personal information. 
The only research directed at the 
study of high and low revealers is that of 
Colson,3 Frankfurt (1965), and Taylor 
(1965), who showed that high scorers on the 
Self-Disclosure Scale (Jourard & Lasakow, 
1958) revealed more about themselves than 
low scorers, and a study by Jourard 
(1961b) which showed that high scorers on 
the Self-Disclosure Scale produced more Rorschach 
responses than low scorers. Frankfurt, 
however, has failed to show a correlation between 
revealing scores and scores on four 
scales of the Guilford-Zimmerman Temperament 
Survey. 
Indirect evidence relevant to the question 
of what high revealers are like has been obtained. 
Colson 3 and Frankfurt (1965) have 


1 From Bureau of Medicine and Surgery, Navy 
Department, Research Task MR005.12-2005.01, Subtask 1. 
The opinions and statements contained herein 
are the private ones of the writer and are not to be 
construed as official or reflecting the view of the 
Navy Department or the Naval Service at large. 
The author wishes to express his appreciation to 
Irwin Altman whose suggestions and support 
contributed much to this research. 

2 Now at the Graduate School of Education, 
Rutgers, The State University. 

3 W. N. Colson, “Self-Disclosure as a Function of 
Social Approval,” unpublished manuscript, 1965. 


found that more information is gradually revealed 
when a person is socially approved of 
by another person, while Altman and Hay- 
thorn (1965) have found that partners in an 
isolated environment reveal more intimate 
facts to one another than members of non- 
isolated pairs. It appears that self-disclosure 
and social reinforcement are essential in the 
development of social relationships. Newcomb 
(1961) has also shown that information 
exchange plays a role in the acquaintance 
process which ultimately ends in the estab- 
lishment of friendships, while Jourard (1959) 
and Jourard and Landsman (1960) have 
shown that self-disclosure is a stimulant for 
others to reveal in return, thus hastening the 
acquaintance process. On the basis of these 
findings, it is expected that persons most 
inclined toward gaining and maintaining rela- 
tionships with others and most sensitive to 
social reinforcement and compatibility con- 
siderations would be most inclined toward 
self-disclosure. They would be the high 
revealers. 

Probing, the attempt to discover personal 
information about others, by directly seeking 
it, and the kinds of people that probe have 
not been studied to any great extent. On the 
surface, probing appears to be more instru- 
mental than revealing insofar as it is directly 
aimed at the goal of gaining personal infor- 
mation in an interpersonal context, while 
revealing is only indirectly aimed at gaining 
information, namely, as a stimulant to disclosure 
by the other person. Revealing appears 
to be aimed primarily at social fluidity and 
only secondarily at information acquisition 
while the reverse is true of probing. Thus, it 
would not be surprising to find that probers 
and revealers are different kinds of people. 

Relevant to the question of what kinds of 
people reveal much personal information 
about themselves to others, and what kinds 
of persons probe for much personal informa- 
tion from others, is a model of individual 
functioning based on the individual's level of 
integrative complexity developed by Harvey, 
Hunt, and Schroder (1961). The theoretical 
basis of the model includes hypothetical de- 
scriptions of four system types in terms of 
the relationship of these types to other people 
as well as in terms of their orientation to 
authority, information acquisition, and the 
like. The system types lie on a concrete to 
abstract information-processing dimension 
where “concrete” and “abstract” refer to the 
level of integrative complexity of the concepts 
used for mediating between environmental 
inputs and appropriate responses. Concrete
concepts represent simple stimulus-response 
links where a single stimulus or nonintegrated 
stimuli lead inevitably to a single response 
possibility, while abstract concepts link inte- 
grated stimulus configurations to a variety of 
response possibilities.

Based on the descriptions of the system 
types given below, it would not be unreason- 
able to expect that the probing and revealing 
tendencies of people classified into one of the 
four systems of integrative complexity could 
be predicted. This would shed additional light 
on probing and revealing and provide a more 
complete picture of “probers” and “revealers”
while, at the same time, clarifying the de- 
scriptions of the system types by broadening 
the range of system applicability. 

The primary purpose of this paper is to 
determine the relationship among the four 
system types of integrative complexity in 
terms of tendency to reveal personal infor- 
mation about oneself to others (self-dis- 
closure) and to seek to uncover, by probing, 
personal information about others. In addi- 
tion, since previous work has shown re- 
vealing to be greater for nonintimate than 
intimate areas, and to be greater to close
friends than to acquaintances (Altman & 
Haythorn, 1965; Colson?; Frankfurt, 1965; 
Taylor, 1965), a second purpose of this paper 
was to replicate these findings of intimacy- 
level differences and target differences in 
revealing tendencies and to see if systematic 
intimacy level and target differences could 
be obtained for probing. 

A final purpose was to determine some es- 
sential similarities and dissimilarities between 
tendencies toward revealing and probing. 


Descriptions of Four Systems of Integrative 
Complexity 


System I. Individuals classified as System I 
are highly concrete. This is typified by
categorical thinking, rigidity, overgeneralization,
intolerance of ambiguity, and consequent reli- 
ance on externally imposed structure, namely, 
authorities, norms, rules, for ambiguity re- 
duction and self-definition. Associations with 
other persons are maintained primarily as a 
basis for guaranteeing clear definitions of the 
situation and as an absolute source of guid- 
ance. For this reason, others are necessary to 
the System I individual. 

System I subjects have been shown to be 
the highest of all system types on
authoritarianism (Tuckman*) and lowest on
creativity (Tuckman, 1966).

System II. Individuals classified as System 
II are moderately concrete. Their behavior 
is characterized by an orientation away from 
and against external sources of control. The 
System II person is opposite to the System I 
person in that the former distinguishes 
strongly between self and other and acts to 
avoid any control by other on the self, while 
the latter fails to make this distinction and 
seeks external control. Harvey et al. (1961) 
have termed this a negatively independent 
orientation.

System II subjects have been shown to be 
highest in Machiavellianism and lowest in 
need for affiliation of all system types 
(Tuckman *). 

System III. Individuals classified as System
III are moderately abstract. Their behavior
is characterized by an orientation toward
people as a source of pleasure and guidance.
This guidance concerns the extent to which
behavior is in accord with role expectations.
The System III person obtains this guidance
through his ability to “take the role of the
other” (Mead, 1934). The System III person's
empathic relation to others and his
determination to maintain his social relationships
make him similar to the other-directed
person of Riesman (1950).

* B. W. Tuckman, “Integrative Complexity
Measurement: Relation to Measures of Attitudinal
Orientation,” unpublished manuscript, 1965.

System IV. Individuals classified as System 
IV are maximally abstract. They typically 
maintain an informational interdependent 
relationship with their environment. Since 
people represent simply a part of their en- 
vironment, they relate to others in an 
informational manner, and are in no way 
interpersonally constrained. Their thinking 
processes are characterized by openness, flexi- 
bility, and an orientation toward diversity. 

System IV subjects have been shown to be 
lowest in authoritarianism (Tuckman*) and 
highest in creativity (Tuckman, 1966). 


Derivation of a Hypothesis on Revealing 


In summarizing the descriptions of the four 
system types with regard to interpersonal 
relations, it may be said that the System III 
subjects are social insofar as they are oriented 
toward others, that the basis for their social 
orientation is through a genuine attachment 
to others (rejection is threatening), and that 
their most well-developed skills are those of 
“taking the role of the other.” The System I 
subjects may also be described as social, but 
their relation to others is based on their need 
for structure for which others are instru- 
mental. If revealing functions in the service 
of getting acquainted, and gaining and main- 
taining social relationships, then we would 
expect System III and System I persons to 
be high revealers.

System II persons can be best described as 
antisocial insofar as others are a potential 
source of control which is to be avoided. 
Finally, System IV persons are asocial; they 
relate to people in much the same way they 
relate to ideas—as information. If revealing 
serves the function of promoting social at- 
tachments, then we would expect System II 
and System IV persons to be low revealers. 


Thus, on the measure of revealing, Hy- 
pothesis 1 states that the system types will 
be ordered as follows: III-I, II-IV with 
System III and System I individuals highest 
in amount of information revealed and Sys- 
tem II and System IV individuals lowest in 
amount revealed. 


Exploratory Questions 


Since there are many functions that probing 
can conceivably serve, it was difficult to de- 
rive a hypothesis ordering the system types 
on probing. System IV subjects may probe 
in order to obtain information about the en- 
vironment; System III subjects may probe to 
discover the feelings of others; System II 
subjects may probe to maintain autonomy; 
and System I subjects may probe to discover 
norms. Consequently, the relationship among 
the system types on probing was left as an 
exploratory question.

Based on past findings and the validity of 
the technique for measuring revealing, it was 
expected that best friend would be revealed 
to more than acquaintance and that more
revealing would take place at nonintimate than
at intimate levels. An exploratory question 
posed in this study was whether comparable 
target and intimacy-level findings would be 
obtained for probing as for revealing. Spe- 
cifically, the question was posed as to whether 
best friend would be probed more than 
acquaintance, and whether more noninti- 
mate than intimate information would be 
probed for. 

Finally, similarities between revealing and 
probing, other than the above, and dissimi- 
larities between the two tendencies were left 
as exploratory in nature. 


METHOD 


Subjects 


Subjects were 299 Naval recruits about to begin
training in Radioman's School at Bainbridge Naval
Training Center.⁵ Of the total, 155 subjects were
used to study revealing, while the remaining 144 were
used to study probing. Both samples were drawn at
random from the same general population. Subjects
in each sample ranged in age from 17 to 24 with
a median of 18. The intelligence range for each
sample, as measured by the Navy General
Classification Test (GCT), was 50-70 with a median of 56
(about 112 IQ). On this basis, the samples were
judged to be comparable.

⁵ The author would like to thank A. D. Garvin,
United States Navy, Commanding Officer of the
United States Naval Service School Command,
Bainbridge, Maryland, for providing the subjects used in
this study.


Measurement of Integrative Complexity 


All 299 subjects took the Interpersonal Topical 
Inventory of Integrative Complexity (ITI) de- 
veloped by Tuckman (1966). Briefly, it is a forced- 
choice instrument in which the subject is asked to 
choose one of a pair of items that best represents his 
feeling about or reaction to an interpersonal situa- 
tion. Each item represents a typical response for 
one of the systems of integrative complexity. Items 
are paired so that all possible system comparisons 
occur an equal number of times. Two pairs of items 
taken from the ITI appear below as illustration. 


When I am criticized . . .

a. I try to take the criticism, think about it, and
value it for what it is worth. Unjustified criticism
is as helpful as justified criticism in discovering
what other people's standards are.

b. I try to accept the criticism but often find that
it is not too justified. People are too quick to
criticize something because it doesn't fit their
standards.

c. I try to determine whether I was right or wrong.
I examine my behavior to see if it was abnormal.
Criticism usually indicates that I have acted badly
and tends to make me aware of my own bad points.

d. It could possibly be that there is some
misunderstanding about something I did or said.
After we both explain our viewpoints, we can
probably reach some sort of compromise.


The subject is given four scores, each representing
his total number of choices for each of the four
systems. Based on norms obtained from 425 persons
from the same population, system scores were
converted to decile scores, and the subject was assigned
to that system in which he scored in one of the
three top deciles, provided his decile scores in the
other three systems were not as high as his highest.
From the revealing sample and similarly from the
probing sample, 39 subjects were classified as
System I, 26 as System II, 25 as System III, and 22
as System IV. Of the total number of 299 subjects,
224 (75%) were thus classified, 112 in the revealing
sample and 112 in the probing sample. The
remainder were rejected as unclassifiable because they
scored equally high in more than one system or not
high enough in any.

⁶ Toward the end of the study, subjects' scores
on the ITI were calculated before proceeding to the
rest of the battery. Once the number of subjects
of a particular system type in the probing sample
equaled the corresponding number in the revealing
sample, no additional subjects in that system were
given the rest of the battery.
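The assignment rule described in this section can be sketched in a few lines of Python. This is only an illustration of the rule as worded here, not the author's scoring procedure: the function name `classify`, the dictionary layout, and the assumption that deciles are numbered 1 (lowest) to 10 (highest) are all hypothetical.

```python
from typing import Dict, Optional

SYSTEMS = ("I", "II", "III", "IV")

# Assumption: deciles are numbered 1 (lowest) to 10 (highest),
# so the "three top deciles" are 8, 9, and 10.
TOP_DECILES = frozenset({8, 9, 10})


def classify(deciles: Dict[str, int]) -> Optional[str]:
    """Return the assigned system, or None if the subject is unclassifiable."""
    best = max(deciles[s] for s in SYSTEMS)
    leaders = [s for s in SYSTEMS if deciles[s] == best]
    if len(leaders) > 1:
        # Scored equally high in more than one system.
        return None
    if best not in TOP_DECILES:
        # Not high enough in any system.
        return None
    return leaders[0]


if __name__ == "__main__":
    print(classify({"I": 9, "II": 4, "III": 6, "IV": 2}))
    print(classify({"I": 9, "II": 9, "III": 6, "IV": 2}))
    print(classify({"I": 7, "II": 4, "III": 6, "IV": 2}))
```

The two `None` branches correspond to the two reasons given for rejecting a subject as unclassifiable.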


Measurement of Revealing 


Tendency to reveal information about the self to 
others was measured using the 25-item version 
(Jourard, 1961a) of the Self-Disclosure Scale de- 
veloped by Jourard and Lasakow (1958). The items 
of this scale each represent a class of information 
about the self (e.g., church membership, future aims, 
details of sex life, etc.). The subject is instructed to 
indicate with respect to each item whether the target 
person knows the information about him as a result 
of his having told it to the target person. The 
subject responds to the set of items twice, once for 
a target person labeled as a casual acquaintance and 
a second time for a target person labeled as best 
friend (same sex). The order of the targets is always 
the same. 

In addition to the differentiation of revealing be- 
havior based on target, a breakdown of revealing 
based on level of intimacy was also accomplished 
after the subjects had completed the scale. The 25 
items had previously been administered to a large 
college sample with instructions to rate the level of 
intimacy of each item (Taylor, 1965). The calcula- 
tion of scale values for each item led to the identifi- 
cation of two clusters of 10 items each: relatively 
nonintimate disclosures (lowest scale values), and 
relatively intimate disclosures (highest scale values). 
The 5 items having the highest scale values were 
included in the scale for embedding purposes but 
excluded from the data analysis.⁷

Amount of self-disclosure was then subdivided by 
target and intimacy level yielding four scores for 
each subject, as follows: number of nonintimate 
disclosures to acquaintance, number of intimate 
disclosures to acquaintance, number of nonintimate 
disclosures to friend, and number of intimate
disclosures to friend. (In each of these four areas, the
maximum score obtainable was 10.) This repre- 
sented the basic data for all analyses involving 
revealing. 
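As a reading aid, the four scores per subject can be sketched as simple counts over the two 10-item clusters. The item numbers are those given in the footnote on the 25-item version of the scale; everything else (the function name `four_scores`, the use of sets of endorsed item numbers, the key names) is an assumption made for illustration.

```python
from typing import Dict, Set

# Item numbers as listed in the footnote on the 25-item scale.
NONINTIMATE_ITEMS = frozenset({1, 3, 4, 5, 6, 8, 11, 12, 13, 14})
INTIMATE_ITEMS = frozenset({15, 16, 17, 18, 19, 21, 22, 23, 24, 25})


def four_scores(told_acquaintance: Set[int], told_friend: Set[int]) -> Dict[str, int]:
    """Count endorsed items by target and intimacy level (maximum 10 each).

    Each argument is the set of item numbers the subject marked as
    disclosed to that target. Items outside the two clusters (the five
    embedding items) are ignored.
    """
    return {
        "acq_nonintimate": len(told_acquaintance & NONINTIMATE_ITEMS),
        "acq_intimate": len(told_acquaintance & INTIMATE_ITEMS),
        "friend_nonintimate": len(told_friend & NONINTIMATE_ITEMS),
        "friend_intimate": len(told_friend & INTIMATE_ITEMS),
    }


if __name__ == "__main__":
    # Hypothetical subject: little disclosed to acquaintance, everything to friend.
    print(four_scores({1, 2, 3, 15}, set(range(1, 26))))
```

The same counting applies to the probing version of the scale, since only the instructions differ.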


Measurement of Probing 


The Self-Disclosure Scale was also used to measure
probing except that the instructions were changed
(Frankfurt, 1965). The subject was instructed to
indicate with respect to each item whether he knows
the information about the target person as a result
of having asked about it. Each subject responded
to the 25 items twice, first with casual acquaintance
as the target person, and second with best friend
(same sex) as the target person. In each case, items
were separated according to level of intimacy into
nonintimate and intimate probes. The item
breakdown was the same as that used in revealing. Amount
of personal probing was then subdivided by target
and intimacy level yielding the same basic four scores
as in the case of revealing.

⁷ On the 25-item version of the Self-Disclosure
Scale (Jourard, 1961a, p. 316), the following items
(identified by number) were scaled as nonintimate:
1, 3, 4, 5, 6, 8, 11, 12, 13, and 14; the following
items were scaled as intimate: 15, 16, 17, 18, 19, 21,
22, 23, 24, and 25.


TABLE 1

ANALYSIS OF VARIANCE OF REVEALING AND PROBING SCORES BY SYSTEM (A),
TARGET (B), AND INTIMACY LEVEL (C)

                                      Revealing             Probing
Source                       df      MS        F          MS        F
Between subjects            111
  A                           3     26.9     3.36**       1.2     0.20
  Subjects within           108      8.0                  5.9
Within subjects             336
  B                           1   1147.5   358.59****  1724.6   410.62****
  A X B                       3      1.9     0.59        10.6     2.52*
  B X subjects within       108      3.2                  4.2
  C                           1   1212.5   319.08****    86.6    78.73****
  A X C                       3      0.9     0.24         0.5     0.45
  C X subjects within       108      3.8
  B X C                       1     43.1    15.96        12.0     3.87*
  A X B X C                   3      3.1     1.15         2.1     0.68
  B X C X subjects within   108      2.7                  3.1

* p < .07.
** p < .025.
*** p < .01.
**** p < .001.


Procedure 


A separate group of subjects was tested for 
probing and revealing analyses. Since the two meas- 
ures are identical except for instructions, the pos- 
sibility of subjects failing to discriminate between 
the two was avoided by using different subjects for 
each measure. In both samples, the ITI was
administered first and then other tests, used for another
purpose, were interpolated between administration 
of the ITI and the Self-Disclosure or Probing Scale. 
On these latter measures, the “acquaintance as 
target” version was presented before the “best friend 
as target” version in all cases. The entire battery
was always administered in one session. However, 
the number of subjects tested in each session varied 
as a function of subject availability. (Neither the 
entire revealing sample nor the entire probing sample 
was tested at one time.) 


Data Analysis


The following three-factor analyses of variance
with repeated measures on two factors and unequal
n's (unweighted-means solution), described by
Winer (1962, p. 374), were done on the data:

1. A 4 X 2 X 2 analysis for: systems, I-IV; targets,
best friend and casual acquaintance; and intimacy
levels, nonintimate and intimate, on data from the
Self-Disclosure Scale (N = 112).

2. A 4 X 2 X 2 analysis containing the same factors
and levels as above on data from the Probing Scale
(N = 112).

3. A 2 X 2 X 2 analysis for: exchange processes,
probing and revealing; targets, best friend and
casual acquaintance; and intimacy levels,
nonintimate and intimate (N = 224).

The three analyses were performed separately 
rather than as a single four-factor analysis because 
of heterogeneity of variance considerations. The third 
analysis was performed in order to compare the 
processes of revealing and probing directly. 
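In analyses of this kind, each F ratio is the mean square for an effect divided by the mean square of its own error term (subjects within groups for the between-subjects factor, and the matching "X subjects within" term for each within-subjects effect). The sketch below is only a reader's consistency check, not part of the original analysis: it recomputes several revealing F ratios from the mean squares as read from Table 1. The names `f_ratio` and `TABLE_1_REVEALING` are hypothetical.

```python
# (effect, MS effect, MS error, F as reported), revealing columns of Table 1.
TABLE_1_REVEALING = [
    ("A", 26.9, 8.0, 3.36),       # error: subjects within groups
    ("B", 1147.5, 3.2, 358.59),   # error: B X subjects within
    ("A X B", 1.9, 3.2, 0.59),    # error: B X subjects within
    ("C", 1212.5, 3.8, 319.08),   # error: C X subjects within
    ("A X C", 0.9, 3.8, 0.24),    # error: C X subjects within
    ("B X C", 43.1, 2.7, 15.96),  # error: B X C X subjects within
]


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F is the effect mean square divided by its error mean square."""
    return ms_effect / ms_error


if __name__ == "__main__":
    for effect, ms_effect, ms_error, reported in TABLE_1_REVEALING:
        computed = f_ratio(ms_effect, ms_error)
        print(f"{effect:6s} computed F = {computed:8.2f}   reported F = {reported:8.2f}")
```

Small discrepancies in the second decimal place are to be expected, since the tabled mean squares are rounded to one decimal.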


RESULTS 
System Effects 


Revealing. Analysis of variance of revealing
scores by system (A), target (B), and
intimacy level (C) appears in Table 1. The
main effect for systems was significant (F =
3.36, p < .025), while none of the interactions
involving system was significant. The mean
revealing score for each of the four systems
appears in Table 2. The systems were ordered
as follows: III, I, II, IV with the mean for
System III highest. All differences were
significant except the difference between Systems
II and IV. Thus, Hypothesis 1, which
predicted a system ordering of III-I, II-IV on
revealing, was confirmed by the data with the
addition that System III was discriminably
different from System I, a finding not
predicted. Furthermore, this obtained ordering
(the main effect of systems) was contingent
upon neither target nor intimacy level as
indicated by nonsignificant System X Target
(A X B) and System X Intimacy Level (A X
C) interactions.


TABLE 2

MEANS FOR THE FOUR SYSTEMS ON
TOTAL REVEALING SCORE

               System
   I        II       III      IV
 24.2a     22.2     26.3b    21.7

a Larger than the means for System II and System IV
(p < .01) by Duncan range test.
b Larger than the means for each of the other three systems
(p < .01) by Duncan range test.


Probing. The analysis of variance of
probing scores by system, target, and
intimacy level also appears in Table 1. The
main effect for systems was not significant
(F = .20). However, the interaction
between systems and target (A X B) was
significant at the .07 level (F = 2.52), and
therefore was explored further. The mean
probing score by target for each of the four
systems appears in Table 3. On probing of
acquaintance, the systems were ordered as
follows: III, I, II, IV with significant
differences between System III and Systems II
and IV and a difference approaching
significance between System I and System IV. The
system ordering on probing of acquaintance
is identical to the ordering of the systems
on total revealing.

On probing of friend, the systems were
ordered as follows: IV, II, I-III with
significant differences between System IV and
Systems I and III and a difference between
Systems IV and II that approached
significance. The finding that the ordering of the
systems on probing was a function of target
will be discussed later.


TABLE 3

MEANS FOR THE FOUR SYSTEMS ON
PROBING BY TARGET

                        System
                 I       II      III     IV
Acquaintance    8.4a    7.8     9.1b    7.4
Friend         15.7    16.0    15.7    17.0c
Total          24.0    23.8    24.8    24.4

a Larger than the mean for System IV (p < .10) by Duncan
range test.
b Larger than the mean for System II (p < .05) and the
mean for System IV (p < .01).
c Larger than the mean for System II (p < .10) and the
means for Systems I and III (p < .05).


Target and Intimacy-Level Effects 


Revealing. From Table 1, it can be seen 
that the main effects of both target (B) and 
intimacy level (C) on revealing are highly 
significant (p < .001 in both cases). As
Figure 1 shows, best friend as target was 
revealed to more than casual acquaintance as 
target, and revealing in nonintimate areas 
was considerably greater than revealing in 
intimate areas. This confirms expectations 
and provides validity for the differentiation 
of target and intimacy level used in the
Self-Disclosure Scale (i.e., it indicates that
subjects were following instructions and paying
attention). 

In addition, a significant Target X Inti- 
macy (B X C) interaction was obtained. The 
interaction was based on the fact that intimate 
and nonintimate levels of disclosure to best
friend were more comparable than they were 
to casual acquaintance. In part, this may be 
the result of a ceiling effect limiting non- 
intimate disclosures to best friend. Half of 
the subjects “hit” this ceiling. 

Probing. From Table 1, it can be seen that 
the findings for probing parallel those for 
revealing, namely, highly significant main ef- 
fects of target and intimacy level. Figure 1 
shows that more nonintimate probing was 
done, and friend was probed more than ac- 
quaintance. 

Again, as in the case of revealing, a sizable
Target X Intimacy interaction on probing
was obtained (F = 3.87, p < .07). Again, as
in the case of revealing, this tendency for
intimate and nonintimate levels of probing of
best friend to be more comparable than they
were to casual acquaintance appears to be
based on a ceiling effect.


[Figure 1: graph of mean amount probed and revealed,
with one panel for target (ACQ, FR) and one for
intimacy level (NON-INT, INT); revealing and probing
are plotted separately.]

FIG. 1. Mean amounts probed and revealed to each
of the targets (ACQ = acquaintance, FR = friend)
and at each of the intimacy levels (NON-INT =
nonintimate, INT = intimate).


TABLE 4

COMBINED ANALYSIS OF VARIANCE OF INTERPERSONAL
EXCHANGE SCORES BY PROBING-REVEALING (A),
TARGET (B), AND INTIMACY LEVEL (C)

Source                       df       MS        F
Between subjects            223
  A                           1      4.0      0.56
  Subjects within           222      7.1
Within subjects             672
  B                           1   2842.9    768.35**
  A X B                       1     29.8      8.05*
  B X subjects within       222      3.7
  C                           1    973.6    405.67*
  A X C                       1    325.5    135.62*
  C X subjects within       222      2.4
  B X C                       1     50.2     17.31*
  A X B X C                   1      4.8      1.65
  B X C X subjects within   222      2.9

* p < .01.
** p < .001.


Probing versus Revealing Effects 


Probing and revealing were included in a 
common analysis in order to determine the
extent to which the two processes were similar 
and different. The two previous analyses, 
that is, the analyses of probing and revealing 
separately, allow some conclusions to be 
drawn about each process relative to the 
other, but such analyses are indirect. Since 
no previous work had been done on probing, 
no hypotheses were developed concerning the 
relation between the two processes, and this 
analysis was exploratory in nature. 

The analysis of variance of interpersonal
exchange scores by probing-revealing (A),
target (B), and intimacy level (C) appears
in Table 4. The main effect of
probing-revealing was not significant (F = .56),
indicating the mean level of probing (24.1) and
the mean level of revealing (23.7) are
comparable. Thus, individuals tend to reveal about
the same amount as they probe in personal
areas.

The main effect of target and the main
effect of intimacy level were both significant.
This follows directly from the previous two
analyses in which both effects were also
significant, and, thus, this analysis provides no
new information. A significant Target X
Intimacy interaction also appears in this analysis
as it did in both prior analyses and is not
surprising.

A finding uniquely shown in this analysis
is a significant interaction between
probing-revealing and target (A X B). This
interaction is displayed graphically in Figure 1.
Using the Duncan studentized range statistic
(Winer, 1962, p. 77) with n = 112, it can
be shown that this significant interaction is
based on the fact that friend is probed (M =
16.0) significantly more (p < .01) than he
is revealed to (15.0) while casual
acquaintance is revealed to (8.6) significantly more
(p < .05) than he is probed (8.1). While
both differences are significant, they are quite
small. Based on Figure 1, probing and
revealing processes are roughly comparable
across targets even though the statistically
significant interaction would tend to obscure
this comparability.

A second unique contribution of this
analysis, and a substantial one, is the significant
interaction between probing-revealing and
intimacy level (A X C). This interaction is
displayed in Figure 1. Analysis by Duncan
range test (n = 112) shows that this
significant interaction is based on the fact that
nonintimate information is revealed (M =
15.1) significantly more (p < .01) than it is
probed for (13.0) while intimate
information is probed for (11.2) significantly more
(p < .01) than it is revealed (8.6). Both
differences are sizable. Figure 1 clearly shows
that the processes of probing and revealing
behave considerably differently across
intimacy levels.


TABLE 5

MEANS FOR THE FOUR SYSTEMS ON PROBING AND
REVEALING TO ACQUAINTANCE (ACQ) AND BEST
FRIEND (FR) FOR NONINTIMATE (N) AND
INTIMATE (I) LEVELS OF INFORMATION

                  Revealing                   Probing
               Acq         Fr             Acq         Fr
System       N     I     N     I        N     I     N     I
I           6.4   2.4   8.9   6.4      4.7   3.7   8.2   7.4
II          5.6   2.2   8.7   5.8      4.4   3.4   8.3   7.8
III         7.3   3.0   9.1   6.9      5.1   4.0   8.2   7.5
IV          5.7   1.7   8.8   5.5      4.6   2.8   8.6   8.4


[Figure 2: graph of mean amounts probed and revealed
for each system, by target (ACQ, FR); revealing and
probing are plotted separately.]

FIG. 2. Mean amounts probed and revealed by
each of the systems to each of the targets (ACQ =
acquaintance, FR = friend).

As a way of summarizing the data, a table 
of means for the System X Target X Inti- 
macy X Probing-Revealing matrix (4 X 2 X 
2 X 2) appears in Table 5. 


DISCUSSION 


From the data it was possible to construct 
a picture of the systems in terms of the extent 
to which probing and revealing occurred 
across targets (see Figure 2). It may be 
noted from the figure that the probing and 
revealing patterns of System III and Sys- 
tem I are similar, as are the probing and 
revealing patterns of Systems IV and II. 
Comparable differentiation of targets and of 
probing and revealing accounts for the simi- 
larities. 

System III individuals were the highest
revealers of all system types, and significantly
so. This high level of revealing among System
III subjects occurred across both targets and
both levels of intimacy, and thus was neither
target nor intimacy-level specific. In terms
of total probing, System III individuals were
comparable to individuals of the other three
systems. However, considering the different
targets, System III subjects probed
acquaintance more and friend less than subjects of
the other systems. Overall, System III subjects
appeared willing to reveal more than others to
either target, but exceeded others in probing
of acquaintance only. Perhaps additional
information about friends is acquired by
waiting for the friend to reveal it. The tendency
for the System III person to reveal more than
probe occurred for both targets, more
dramatically in the case of acquaintance where
both probing and revealing were high but
revealing was higher, and for both intimacy
levels.

System I individuals manifested a probing
and revealing pattern comparable to that of
the System III subjects. They were the
second highest revealers among the four
system types, and maintained this across
targets and intimacy levels. In total probing,
they too were at an average level, being
somewhat higher in probing of acquaintance and
somewhat lower in probing of friend as
compared to the other system types. Here again
the System I subjects paralleled the System
III subjects. However, this parallel, as others,
was one of direction; Systems III and I
subjects did differ significantly in degree on
most of the measures. Finally, System I
subjects tended to probe and reveal about the
same amount to acquaintance, and to probe
and reveal about the same amount to friend,
both findings holding across intimacy levels.


The probing and revealing pattern of the
System IV subjects differed markedly from
that of Systems III and I subjects, as can
be readily seen in Figure 2. System IV
subjects along with System II subjects were the
lowest revealers, regardless of target or
intimacy level. In overall probing they were
equal to the other three system types.
However, in probing of acquaintance they were
again lowest, while in probing of friend they
were highest. Thus, the System IV subjects
tended to reveal minimally to everybody, and
to probe acquaintance minimally. At the
same time, they exhibited considerable
probing of friend. In fact, their tendency toward
friend was weighted heavily in favor of
probing over revealing, while no differences
appeared in tendency toward acquaintance.
This finding held across intimacy levels.
Perhaps probing and revealing to acquaintances
represents a social “luxury” and is engaged in
principally by those who are socially inclined,
namely, Systems III and I subjects, while
probing and revealing to friends is a
“necessity” in terms of acquiring information
relevant to the interaction. The best way to
acquire such “necessary” information may be
to probe for it, and this is exactly what the
informationally oriented System IV
subjects, and to a lesser extent the System II
subjects, do.

The probing and revealing pattern of the 
System II subjects was similar to that of the 
System IV subjects, but differed significantly 
in degree on some of the measures. Systems 
II and IV subjects were similar in that both 
were the lowest revealers for all targets and 
intimacy levels. The overall level of probing 
of the System II subjects was roughly equal 
to that of the other system types; the System 
II subjects equaled the System IV subjects 
as the lowest probers of acquaintance, and 
were second to the System IV subjects as 
highest probers of friend. Thus, their levels 
of revealing and probing of acquaintance were 
equal, and low, while friends were probed 
much more than they were revealed to, at 
both intimacy levels. 

The speculation that probing of acquain- 
tances is a social “luxury” and probing of 
friends is a social “necessity” is offered to 
account for system by target findings in the 
data on probing. Other explanations can also 
be made. The System II subjects, who are
high in probing friend and low in probing 
acquaintance, have been shown to be more 
manipulative and less affiliative than the 
other system types (Tuckman 5); conceivably 
low levels of probing of acquaintances reflect 
this low affiliativeness while high probing of 
friend represents this high manipulativeness. 
This line of reasoning would not apply to 
System IV subjects, however, who are neither 
especially low in affiliativeness nor high in 
manipulativeness.

Turning now to a consideration of probing
and revealing in general, apart from the role 
of the systems as mediators of process-target 
effects, it was found that the overall amount 
of revealing and that of probing were highly 
comparable. In addition, comparable effects 
of target and of intimacy level were ob- 
tained, namely, that friends were both probed 
and revealed to more than acquaintances, and 
that both probing and revealing were greater 
in nonintimate than intimate areas. A differ- 
ence between probing and revealing was that 
probing exceeded revealing in intimate areas, 
while the reverse was true in nonintimate 


areas. This appears to reflect prevalent 
cultural norms which guide interaction in the 
direction of “finding out” more than you 
“tell” in highly personal areas while “telling” 
more than you try to “find out” in more 
superficial areas in return. 

The fact that the measurement of probing 
and revealing to casual acquaintance always 
preceded that to best friend leaves open the 
possibility that target differences in both 
probing and revealing can be accounted for 
on the basis of an order effect, perhaps fa- 
miliarity. However, the size of the target 
effect in both probing and revealing, its con- 
sistent appearance across studies (mentioned 
earlier), and its fit to expectations based on 
the meaning of the words “casual acquain- 
tance” and “best friend” minimize this 
alternative explanation. 

No attempt was made to introduce passage 
of time as an independent variable. It is 
known that both breadth and depth of self- 
disclosure change over time (Colson *; Frank- 
furt, 1965; Taylor, 1965), and these changes 
have been incorporated into a model of self- 
disclosure (Altman & Haythorn, 1965). It is 
conceivable that the findings obtained in this 
study would hold for certain points in time 
but not for others. As an example, it is pos- 
sible that after considerable time had elapsed 
in a relationship, individuals of all four sys- 
tem types would be revealing the same 
amount to the other person involved. Further 
research is needed to discover the effects of 
time on the relationships demonstrated in 
this study. 


SPREAD OF SOCIAL INFLUENCE ON CHILDREN’S 
JUDGMENTS OF NUMEROSITY 1 


HERBERT D. SALTZSTEIN, PRESTON B. ROWE, AND MARTHA E. GREENE 2 


Massachusetts Institute of Technology 


In 2 studies of generalization of social influence, pairs of children made 
several numerosity judgments of 3 irregular displays of dots. Each child re- 
ceived bogus notes purportedly coming from a peer giving much higher esti- 
mates than his own estimates of the initial display. Social influence effects were 
assessed separately for the judgments of that initial display and of the other 
displays presented later in the series unaccompanied by further bogus notes. 
Boys showed definite carry-over effects to later judgments in a private as 
well as a public setting, at least for certain displays and orders of presentation. 
Girls showed more definite immediate effects on the initial display, but little, if 
any, carry-over effects to later judgments of the other displays. 


The importance of social influence lies not 
only in the immediate effects on judgments, 
as dramatic as these may be, but in the extent 
to which these effects generalize or carry over 
to judgments of new but related items. Sev- 
eral years ago Fisher and his colleagues 
(Fisher & Lubin, 1958; Fisher, Rubenstein, 
& Freeman, 1956; Zolman, Wolf, & Fisher, 
1960) demonstrated generalization effects in 
their studies of social influence on numerosity 
judgments. The basic procedure of one of 
these experiments (Fisher & Lubin, 1958) 
may serve as an example. 

The subject was asked to estimate the num- 
ber of parachutists in two irregular displays. 
Several judgments were made of each display, 
and after each estimate the subject was ex- 
posed to bogus estimates from a confederate 
which were systematically discrepant from his 
own. This induced many subjects to revise 
their own previous estimates on the first dis- 
play. Furthermore, an influence effect was 
noted on the initial judgment of the second 
display even though that judgment was made 
before receiving the bogus estimates for that 
display. That is, in the experimental condi- 
tion, the subject’s initial judgment of the sec- 
ond display was systematically different from 
the initial judgment of subjects in the con- 
trol condition where they had not been ex- 
posed to prior bogus estimates by a confed- 
erate. The social influence effects manifested 
during successive estimates of the same dis- 
play may be called within-trial influence, and 
the effects manifested on the subsequent dis- 
play prior to any new influence attempts may 
be called between-trial influence. In some in- 
stances the between-trial effect was greater 
than the within-trial effect. 

United States Public Health Service. Further 
aid was received from National Aeronautics and 
Space Administration Grant NaG 496 to Hans- 

In these studies Fisher and his colleagues 
have pointed to an important phenomenon 
which needs to be studied intensively, par- 
ticularly for its relevance to understanding 
the social bases for the formation of stand- 
ards. Since their experiments were among the 
first to establish between-trial or carry-over 
effects for this type of task, it is understand- 
able that not all the conditions necessary for 
these effects could be defined. In the present 
experiments we hope to delineate some of 
the essential conditions for obtaining between- 
trial influence and to gain greater control over 
those factors that in the early experiments 
might have made the initial results less in- 
terpretable. For example, in their study, the 
main measure of within-trial influence was 
based on a series of estimates given by the 
subject after only one exposure of the display. 
In our experiment within-trial change is as- 
sessed over a series of estimates each made 
after a new exposure of the display. There 
were two main reasons for this change in pro- 
cedure. First, it was felt that reexposure of 
the display before each judgment would pro- 
vide the subject with a sufficient basis for 
revising his estimate, thus avoiding induce- 
ment of a spurious stability of judgments 
within a trial. Second, since the measure of 
between-trial change was based on judgments 
after a new exposure of the display, the change 
in procedure would make more comparable 
the stimulus conditions for within- and be- 
tween-trial influence. 


EXPERIMENT I 


Since our basic interest in this phenome- 
non lies in its relevance to the formation of 
standards, children are an especially appro- 
priate group for study. Presumably their 
standards are less well formed than adults. 
Our first aim is then to confirm the phenome- 
non with children and establish more pre- 
cisely the conditions necessary for its occur- 
rence. 

In the original studies by Fisher and his 
co-workers it is impossible to determine 
whether between-trial influence reflects a 
genuine change in judgment because the esti- 
mates of the initial and subsequent displays 
were made in public. Earlier Sherif (1935) 
had demonstrated that group influence in 
judging the autokinetic phenomenon carried 
over into judgments made by the individuals 
in the alone situation. But the spread of in- 
fluence effect has not been demonstrated si- 
multaneously across items and from public to 
private. The second aim of the present study 
is, therefore, to investigate these two carry- 
over effects simultaneously by testing the 
spread of influence from initial to subsequent 
items in a private as well as a public setting. 
If between-trial influence obtains in both of 
these situations, we may well conclude that, 
at least under the conditions of this experi- 
ment, between-trial influence indicates gen- 
eralization of a genuine change in judgment. 
If it obtains in the public but not in the pri- 
vate conditions, then it is likely that between- 
trial influence reflects only a social strategy 
for avoiding disagreement. 

The third aim is to determine the degree to 
which between-trial influence was dependent 
on within-trial influence. Between-trial influ- 
ence may simply involve a direct carry-over 
of the original behavior to a new item; if so, 
we would expect carry-over effects to occur 
only in cases where within-trial effects had 
occurred, and that the two kinds of effects 
would be positively correlated. On the other 
hand, it is also conceivable that within-trial 
effects could be smaller or absent because a 
subject considers himself committed to his 
own original judgment in spite of the poten- 
tially modifying influence of discrepant judg- 
ments given by a peer. This latent tendency 
towards a change may become manifest as 
soon as a new display has been presented. 
The between-trial change might even serve as 
a substitute for within-trial change. In this 
case, correlations between within- and be- 
tween-trial influence might be zero or nega- 
tive. 


Method 


Design and subjects. There were three conditions: 
a control condition and two experimental conditions, 
namely, a public setting and a private setting. Sub- 
jects were required to give 18 estimates, 6 each for 3 
displays of dots (labeled A, B, and C). Subjects 
were exposed to a peer's judgments of C. A and B 
were the displays subsequently presented; they were 
introduced to assess whether the pressure to change 
generated during the judgments of C would spread 
to new displays. 

In the control condition, subjects made all 18 
estimates without any information about anyone 
else's estimates. In the two experimental conditions, 
subjects were exposed to a peer's estimates of Dis- 
play C in the manner described below. The two 
experimental conditions differed only in the social 
setting for the estimates of Displays A and B. In the 
public setting, subjects were told that their judg- 
ments of the subsequent displays would be made 
available to a peer at the end of the experiment. In 
the private setting subjects gave their estimates with 
the understanding that these would never be made 
available to any of the other subjects. The setting 
for judging Display C was always public. 

Seventy-four boys and 68 girls, from two suburban 
schools in the Boston area,3 participated in the ex- 
periment. The scores of 10 boys and 11 girls were 
eliminated from the analysis because these 21 sub- 
jects failed to understand the instructions. The re- 
maining 121 children ranged in age from 8-4 to 
13-11. Sixteen boys and 15 girls were tested in the 
control condition; 22 boys and 20 girls gave their 
estimates in the public setting; and 26 boys and 22 
girls in the private setting. Assignment to these 
conditions was random. 

3 The authors are indebted to the Newton Public 
School System, and, in particular, Edward Landy, 
Assistant Superintendent for Pupil Personnel Services 
and Special Education; Herbert J. Callahan, Princi- 
pal of Pierce School; and Edith Clark, Principal of 

Materials. There were three displays of irregularly 
arranged dots. Each display was labeled with a 
letter: A, B, or C. In the experiment by Fisher and 
Lubin (1958) the second display in fact contained 
more dots than the first display (B and C in the 
present experiment, respectively). In the present 
experiment Displays B and C contained the same 
number of dots, 150. In fact, they were the same 
display but labeled differently and presented in dif- 
ferent orientations. This permitted us to study the 
spread of social effects on judgment from the first 
(C) to the second (B) with the relevant stimulus 
property, number of elements, controlled. 

Display A contained 160 dots. The dots formed a 
more nearly circular cluster than in the other dis- 
plays and were more densely packed together. This 
display was included to test the extent of generali- 
zation of social effects on judgment. 

The slides were projected on a 7 × 7 foot screen 
approximately 7.5 feet from the subject. The image 
covered about 2.8 square feet of the screen. 

Procedure. The purpose of the first series was to 
familiarize the subjects with the task and to provide 
a base line of estimates for each individual. Subjects 
in all conditions made three estimates of the three 
displays in the order ABCABCABC. Following this 
base-line series, instructions were introduced to dif- 
ferentiate the control condition from both public 
and private conditions. Slide C was again presented 
three times in succession, and the subject gave an 
estimate after each presentation. These constituted 
the influence series. In both of the experimental con- 
ditions, the subject was exposed to the bogus esti- 
mate between his first and second, and between his 
second and third judgments during this series. 

Each child in a pair served as the agent of influ- 
ence for the other. This was accomplished through 
interception of notes and substitution of standard 
influence notes in place of the original notes. The 
notes sent by the subject consisted of the subject's 
estimate of the preceding display. Substituted in 
place of these actual estimates was a figure three 
times the recipient's actual estimate. The peer ap- 
peared to give the same estimate on both occasions. 
Thus, after making his first and second estimates 
during the “influence” series the subject was exposed 
to an estimate by another which was discrepant 
from the subject's first estimate and bore a constant 
relationship to that first estimate (300%) regardless 
of whether the subject's second estimate was the 
same as the first estimate. The estimate by the other 
judge was always introduced following the next 
exposure of Display C. 

Oak Hill School for their splendid cooperation in ob- 
taining subjects and providing space for testing the 
children. The authors also wish to thank Ann Bird 
for her assistance in securing the cooperation of sub- 
jects during the pretesting period. In this same con- 
nection, thanks are due to Royden Richardson of 
the Tremont Methodist Church for giving permission 
for such recruitment and for providing space for 
pretesting. 

Displays A and B were then shown three times in 
succession in the order BBBAAA. These constituted 
the two crucial postinfluence series. Subjects made 
these estimates without exposure to any further 
estimates by a peer. The postinfluence estimates made 
by the subject served as the tests of between-trial 
influence. 
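
The order of series described above can be written out as a simple schedule. The sketch below is illustrative only and is not taken from the authors' materials; the function and variable names are hypothetical, and the only values drawn from the text are the three series orders and the factor of three used for the substituted notes.

```python
from collections import Counter

BASELINE = "ABCABCABC"     # base-line series: three estimates of each display
INFLUENCE = "CCC"          # influence series: Display C, notes exchanged
POSTINFLUENCE = "BBBAAA"   # postinfluence series: no further notes

def trial_schedule():
    """Return the order of displays judged by one subject."""
    return list(BASELINE + INFLUENCE + POSTINFLUENCE)

def bogus_note(first_influence_estimate, multiple=3):
    """Figure written on the substituted note; it is the same on both occasions."""
    return multiple * first_influence_estimate

schedule = trial_schedule()
per_display = Counter(schedule)   # number of estimates given for each display
note = bogus_note(40)             # a child whose first estimate was 40
```

Because the note is tied to the first estimate of the influence series, any upward revision by the subject narrows the gap between his estimate and the note.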

The public/private instructions were introduced 
between the influence series and the first postinflu- 
ence series (i.e., between CCC and BBB) and re- 
peated between the first and second postinfluence 
series (i.e., between BBB and AAA). 

In order to minimize any possible effects of vary- 
ing time intervals, the following controls were in- 
troduced: Instructions were of approximately the 
same length, and the time intervals between all 
series of estimates were kept approximately con- 
stant. Attempts were also made to fill the time be- 
tween the three successive estimates of C; while 
subjects in the experimental conditions were receiv- 
ing the bogus notes, control subjects received their 
own previous estimates with instructions to check 
them against the record they had kept of all their 
estimates. (Each subject recorded each of his esti- 
mates twice; once on a slip of paper which was im- 
mediately handed to the experimental assistant and 
once for the subject's own record.) 

To insure that the children remembered their 
previous estimates, they were instructed to study the 
record of their own estimates as well as the notes 
they had received ostensibly from the peer. Fifteen 
seconds were allotted for each of these reviews which 
occurred between successive series of estimates. 

Measures and statistics. The main measure of 
within-trial influence was the change from the esti- 
mate made just before the first influence attempt to 
the estimate made just after the second influence 
attempt. The main measures of between-trial influ- 
ence were the changes from the last estimate of B 
(or A) during the base-line series to the first and 
third estimates of B (or A) during the postinfluence 
series. These constituted the initial and final between- 
trial changes, respectively. 

Since the absolute level of the estimate might 
have affected susceptibility to influence on that 
judgment, a rough control of absolute level was in- 
cluded in the measure. This was accomplished by 
dividing the change from the earlier to the later 
estimate by the earlier estimate. So, for example, the 
main measure of within-trial influence was (C6 - C4)/C4. 
(The data were also analyzed using abso- 
lute differences. There were only slight differences in 
distribution for the two measures, with the relative 
measures slightly more sensitive in detecting effects.) 
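
The relative-change measures just defined can be illustrated with a short sketch. It assumes that each subject's six estimates of a display are stored in the order given, so that positions 1 to 3 are the base-line estimates and positions 4 to 6 come from the influence series (for C) or the postinfluence series (for B or A). The function names and the sample figures are hypothetical.

```python
def relative_change(earlier, later):
    """Change from the earlier to the later estimate, divided by the earlier."""
    if earlier == 0:
        raise ValueError("earlier estimate must be nonzero")
    return (later - earlier) / earlier

def influence_scores(c, spread):
    """c: six estimates of Display C; spread: six estimates of B (or A)."""
    return {
        # estimate before the first note versus estimate after the second note
        "within_trial": relative_change(c[3], c[5]),
        # last base-line estimate versus first postinfluence estimate
        "between_initial": relative_change(spread[2], spread[3]),
        # last base-line estimate versus third postinfluence estimate
        "between_final": relative_change(spread[2], spread[5]),
    }

scores = influence_scores(
    c=[100, 110, 100, 100, 120, 150],
    spread=[90, 100, 100, 130, 110, 100],
)
```

In this made-up case the subject raises his estimate of C by half during the influence series, shows an initial between-trial rise of .30 on the spread display, and returns to his base-line level by the third postinfluence estimate.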

To insure that there were no differences among 
the three experimental conditions in preinfluence 
tendencies to change their estimates, the difference 
between the first and third base-line estimates of A, 
B, and C, respectively, was calculated, and then 
divided by the first estimate. No differences between 
conditions did obtain. 

Since the scores were highly variable and not 
normally distributed, the chi-square test or (when 
N was too small) Fisher's exact test was used to 
test differences between conditions. The major com- 
parisons were made between the control and the two 
experimental conditions, and between the public and 
private conditions. These comparisons were made 
separately for boys and girls. In each case, the joint 
median for the groups being compared was used to 
dichotomize the scores. Kendall's rank correlation 
test (tau) was used to test the association between 
within- and between-trial influences. A two-tailed 
test was used throughout. 
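
The median-split comparison described above might be sketched as follows. This is an illustration under stated assumptions rather than the authors' computation: scores equal to the joint median are counted as "not above" (the text does not say how ties were handled), the two-tailed Fisher probability sums all tables no more probable than the observed one, and all names are hypothetical.

```python
from math import comb
from statistics import median

def median_split_table(group_a, group_b):
    """2 x 2 counts of scores above / not above the joint median."""
    cut = median(list(group_a) + list(group_b))

    def row(group):
        above = sum(x > cut for x in group)
        return (above, len(group) - above)

    return [row(group_a), row(group_b)]

def fisher_exact_two_tailed(table):
    """Two-tailed Fisher exact probability for a 2 x 2 table."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # hypergeometric probability of x cases in the upper-left cell
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

table = median_split_table([5, 6, 7], [1, 2, 3])
p_value = fisher_exact_two_tailed(table)
```

With larger cells the same 2 x 2 table would be submitted to the chi-square test instead.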

Validation procedures. The children were inter- 
viewed briefly at the end of the experiment. The 
major purpose of this interview was to determine 
whether they understood the instructions and 
whether the experimental conditions were actually 
induced. In particular, attention was paid to three 
questions: whether the subject understood and 
believed that three, and only three, slides were used; 
whether the meaning of public versus private set- 
tings was understood; and whether the subjects in 
the control condition were aware of the fact that 
they could change their estimate of C during the 
"influence" series. 


Results 


A major aim of this experiment was to 
confirm between-trial influence in children in 
a private as well as a public setting. Medians 
of the initial and final between-trial influence 
(on Displays A and B) and the main within- 
trial influence scores (on Display C) are pre- 
sented by condition in Table 1, along with the 
chi-square tests of differences between these 
conditions. Boys in both the public and pri- 
vate conditions show greater initial between- 
trial influence on Display B (150 dots) than 
do boys in the control condition. The two ex- 
perimental conditions combined show a greater 
increase than the control condition (p < .01), 
thus indicating a marked overall between- 
trial effect. The private-control difference is 
significant (p < .05); the public-control com- 
parison is not. Comparing the public and 
private conditions directly, however, there is 
no difference. 


TABLE 1 


BETWEEN-TRIAL AND WITHIN-TRIAL INFLUENCE: MEDIANS BY SEX AND CONDITION AND CHI-SQUARE TESTS OF 
DIFFERENCES BETWEEN CONDITIONS BY SEX: EXPERIMENT I 


Measures: Between-trial on 150-dot display (B); Between-trial on 160-dot display (A) 

                                   Initial    Final 
Boys' medians 
   Public                           .061      .132 
   Private                          .197      .364 
   Experimental a                   .163      .154 
   Control                         -.086      .021 
Boys' chi-squares 
   Public versus private             ns        ns 
   Public versus control             ns        ns 
   Private versus control            ns        ns 
   Experimental versus control       ns        ns 
Girls' medians 
   Public                           .089      .130 
   Private                          .189      .235 
   Experimental a                   .122      .169 
   Control                           0        .053 
Girls' chi-squares 
   Public versus private             ns        ns 
   Public versus control             ns        ns 
   Private versus control            ns        ns 
   Experimental versus control       ns        ns 

a Public and private combined. 
* p < .05. 
** p < .01. 


The girls in the two experimental (public 
and private) conditions tend to make higher 
estimates than in the control condition, but 
none of these comparisons reaches an accepta- 
ble level of significance. 

By the second postinfluence estimate of 
Slide B the difference between the experi- 
mental and control groups, although still in 
the predicted direction, no longer reaches an 
acceptable level for either boys or girls indi- 
cating that between-trial influence is not sus- 
tained. Neither boys nor girls show a statisti- 
cally significant between-trial effect on Dis- 
play A. Between-trial influence has then been 
demonstrated for boys even when there is lit- 
tle or no sensory basis for revising one’s 
judgment upward on the second display. 

Within-trial influence (for Slide C) reveals 
a different pattern of results (see Table 1). 
Here it is the girls who show clear effects both 
after the first influence attempt (p < .001) 
and after the second influence attempt (p < 
.01). The boys show only nonsignificant 
trends in the predicted direction. 

The overall pattern of results indicates that 
within-trial influence is neither a necessary 
nor a sufficient condition for the appearance 
of between-trial influence. Nevertheless, 
within-trial effects on C may still be corre- 
lated with between-trial effects on B (or A) 
in certain conditions. In fact, for judgments 
of Display B the only significant correlations 
between the main measure of within-trial 
change, (C6 - C4)/C4, and initial between- 
trial change obtain for boys in the private 
condition. The degree of association is mod- 
erately strong (tau = .402, p = .002). In 
neither the control nor the public condition 
for boys nor in any of the three conditions for 
girls does knowledge of extent of change 
within Trial C enable one to predict extent of 
change between trials to B. For the second 
spread display (A), the only significant rela- 
tionship again obtains for boys but this time 
in the public condition (tau = .256, p < .05). 
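
The tau coefficients reported above are Kendall rank correlations between each subject's within-trial and between-trial change scores. A minimal sketch of the tie-corrected coefficient (tau-b) follows; the text does not state which variant of tau was computed or how its significance was evaluated, so this is an assumption, and the function name is hypothetical.

```python
from itertools import combinations
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall rank correlation with the usual correction for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue                 # tied on both variables: ignored
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif (dx > 0) == (dy > 0):
            concordant += 1          # pair ordered the same way on x and y
        else:
            discordant += 1          # pair ordered in opposite ways
    untied_x = concordant + discordant + tied_y_only
    untied_y = concordant + discordant + tied_x_only
    if untied_x == 0 or untied_y == 0:
        raise ValueError("tau is undefined when one variable is constant")
    return (concordant - discordant) / sqrt(untied_x * untied_y)

tau = kendall_tau_b([1, 2, 3], [1, 3, 2])
```

A positive tau means that subjects who changed more within the trial on C also tended to change more between trials.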


Discussion 

The between-trial influence effect proved to 
be a limited one under the conditions of this 
experiment, in the sense that the effect was 
not sustained. That is, between-trial influ- 
ence failed to occur on A nor was it mani- 
fested beyond the first postinfluence judg- 
ment of B. An experiment by Peterson, Saltz- 
stein, and Ebbe (1965) with young male 
adults suggests that the conditions for as- 
sessing social influence in the present experi- 
ment may have been minimally sufficient for 
a between-trial effect. 

They included two experimental conditions. 
In one, the successive estimates to which the 
subject was exposed changed during each 
trial so as to maintain a constant ratio to the 
successive estimates made by the subject 
within that trial. In the other, the estimates 
received by the subject did not change within 
a trial, so that the discrepancy between these 
estimates and the subject's own estimates 
would decrease if the subject revised his own 
estimate upward. There were sizable between- 
trial effects in the former condition, but none 
in the latter. This suggests that the failure 
to obtain more extensive generalization of 
influence effects in the present study may 
have been due to the unchanging character of 
the bogus estimates received by the subject. 

The results by Peterson et al. (1965) 4 sug- 
gest a possible interpretation of the relation- 
ship between within- and between-trial effects. 
Between-trial change may depend on whether 
within-trial change has succeeded in reducing 
the initial discrepancy between the subject’s 
estimate and that of the peer. If change 
within a trial fails to reduce this discrepancy, 
then the unrelieved pressure may become 
manifest as a between-trial effect as soon as a 
new display is introduced. In the present ex- 
periment the subject received what he be- 
lieved to be the peer's first and second esti- 
mates of Display C. It is to be remembered 
that these two estimates were the same, both 
a constant multiple of the subject's first esti- 
mate. Therefore, any increase from the subject's 
first to second estimate on C during the influ- 
ence series would effect a reduction in discrep- 
ancy from the peer’s estimate. A more direct 
test of the interpretation of between-trial in- 
fluence as a function of unchanging within- 
trial discrepancy is then to determine what 
the relationship is between the change from 
the first to the second estimate of C (within- 
trial) and between-trial influence on B. If the 
above assumption is correct, we might expect 
a negative one. Instead, however, we find the 
correlation for boys to be positive and highly 
significant in the private condition (tau = 
.354, p < .006), and positive at a borderline 
level in the public condition (tau = .227, p < 
.07). This suggests that a “trade-off” relation- 
ship did not exist between within- and be- 
tween-trial influence. 

4 “Sequential Social Influence Effects,” unpublished 
manuscript. 

Perhaps the most startling finding of the 
study is the failure of the boys to show any 
immediate effects of exposure to discrepant 
judgments, but marked delayed (between- 
trial) effects. In order to account for this 
“sleeper effect,” it is reasonable to assume 
that some factor was acting to suppress modi- 
fication of the boys’ estimates during the in- 
fluence series, but was removed between the 
influence and postinfluence series and was 
not operative for girls. 

One factor which may have acted to sup- 
press change during the influence series was 
explicit commitment to one’s original esti- 
mate. This commitment may have been based 
on a general need for consistency of judg- 
ment or on the belief that revising one’s esti- 
mate to agree more closely with the other 
child’s estimate was tantamount to cheating. 
The latter explanation is supported by the 
interview with some of the children. It does 
not explain why the girls show a marked 
within-trial effect since there is no consistent 
evidence that boys are more apt to resist 
temptation than girls (Kohlberg, 1963). It is 
possible, however, that boys were more apt 
than girls to treat the situation as a test situ- 
ation, with a concomitant self-instruction not 
to use information from others. 


EXPERIMENT II 


A second experiment was carried out to 
replicate the findings of the first and investi- 
gate the phenomenon further. 

In Experiment I the social influence effect 
spread from the original display to a new 
display which, although presented as differ- 
ent, in fact contained the same number of 
dots as the original display. In the present 
experiment between-trial influence was tested 
for displays which were larger and smaller 
than the original target. Three displays were 
judged in one of two orders. The first display 
always contained 150 dots. The second and 
third displays were varied so that in Condi- 
tion 1 the second display was smaller and the 
third display was larger than the initial dis- 
play, while in Condition 2 the reverse se- 
quence was presented. Half of the subjects in 
the experimental and control conditions were 
exposed to one stimulus order, and half to 
the other. 

Failure to obtain within-trial influence for 
the boys in the earlier experiment was at- 
tributed to an unintentionally induced moral 
inhibition against revising one's original judg- 
ments. The instructions were revised (see 
below) to remove, or at least reduce, this self- 
induced set. Our second aim was to see 
whether this change in instructions would in- 
crease the amount of within-trial influence. 

The conditions for testing within-trial and 
between-trial influence were not exactly com- 
parable in Experiment I in certain time 
factors which might be crucial. In the orig- 
inal study the bogus estimates of the peer 
were always introduced following the next 
exposure of the first display, but before the 
subject wrote down his next estimate of that 
display. The judgments of the subsequent dis- 
plays, however, were made immediately fol- 
lowing exposure of the display. Thus, the 
time interval between exposure of the display 
and the subject's estimate was different for 
the first display and the subsequent displays, 
and thus for within-trial and between-trial 
influence, respectively. In Experiment II, this 
factor was controlled by presenting the bogus 
estimate of the first display before the next 
exposure of that display. This allowed us, as 
a third aim, to examine the relationship be- 
tween the two forms of influence with the 
time between judgment and exposure to the 
display more precisely controlled. 

When a subject makes a judgment of a 
display, he probably has somewhere in mind 
a range of possible or acceptable estimates, 
the actual estimate elicited lying within that 
range. Our fourth aim in the present study was 
to discover whether the information about 
estimates made by a peer effected a change in 
the child’s choice of a “best” estimate within 
a relatively stable range or a change in the 
boundaries of the range itself. 

Procedure. The basic procedure used was the same 
as in Experiment I with the following exceptions: 
All subjects made their postinfluence estimates in 
private; the bogus estimates were always introduced 
before the next exposure of the 150-dot display; the 
displays were projected approximately 12 feet from 
the subject; the control subjects did not receive 
their own (or any other) notes back during the 
influence series. 

As indicated there were two orders of presenta- 
tion of the displays for the experimental and con- 
trol conditions. The first display always contained 
150 dots. In Condition 1, the second display pre- 
sented during the postinfluence series contained 70 
dots and the third contained 300 dots. In Condition 
2, the order was reversed; that is, the second dis- 
play contained 300 dots and the third contained 70 
dots. In both order conditions the first display was 
labeled A, the second labeled B, and the third labeled 
C. 

Conditions 1 and 2 also differed in the order of 
presentation during the base-line series. The order 
of presentation in Condition 1 was 150-70-300, and 
this cycle was repeated three times. In Condition 2, 
the cycle, 150-300-70, was presented three times. 
This difference was to insure that the displays ex- 
posed during the base-line series appeared in alpha- 
betical order, as they did during the influence and 
postinfluence series. 

After these 18 estimates, the children were shown 
the same three displays again for 5 seconds each and 
asked to indicate the range of possibly correct esti- 
mates. They did this by looking over a long list of 
estimates and crossing out those they were “sure 
are wrong because they are too high or too low” 
and checking those “that may be right.” For each of 
the three displays the subject was given a new and 
different list of estimates. These three lists were 
constructed by making three different random order- 
ings of a set of base values (5, 25, 50, 75, 95, 125, 
175, 248, 348, 445, 545, 645, 745, 845, 945, 2100) and 
by replacing each value in these orderings with an 
item that differs from the replaced value by only 
a small amount (less than 10 for all but the largest 
value). Thus, each list had a unique ordering and a 
unique set of items, but all lists contained approxi- 
mately the same values. 
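
The construction of the three lists can be sketched as below. This is one way to satisfy the description and not the authors' actual procedure: the base values come from the text, but the function name, the seed, and the allowed shift for the largest value (here 50, since the text says only that it could exceed 10) are assumptions.

```python
import random

BASE_VALUES = [5, 25, 50, 75, 95, 125, 175, 248, 348,
               445, 545, 645, 745, 845, 945, 2100]

def make_lists(n_lists=3, seed=0, small_shift=9, largest_shift=50):
    """Build lists with distinct orderings and distinct, slightly shifted items."""
    rng = random.Random(seed)
    largest = max(BASE_VALUES)
    used_items, seen_orders, lists = set(), set(), []
    for _ in range(n_lists):
        # draw a random ordering not used for an earlier list
        while True:
            order = BASE_VALUES[:]
            rng.shuffle(order)
            if tuple(order) not in seen_orders:
                seen_orders.add(tuple(order))
                break
        items = []
        for value in order:
            limit = largest_shift if value == largest else small_shift
            # candidate replacements: near the base value, positive, unused
            candidates = [value + s for s in range(-limit, limit + 1)
                          if s != 0 and value + s > 0
                          and value + s not in used_items]
            item = rng.choice(candidates)
            used_items.add(item)
            items.append(item)
        lists.append(items)
    return lists

lists = make_lists(seed=1)
```

Each list then offers the child roughly the same spread of estimates while preventing him from simply copying the marks he made on an earlier list.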

Experimental instructions. One of the aims of this 
second experiment was to test whether the boys 
would show within-trial influence if the instructions 
were designed to eliminate self-instructions equating 
agreement with the peer and cheating. For this rea- 
son the precise wording of this part of the instruc- 
tions read to the subjects in the experimental condi- 
tions is crucial and is reproduced below: 


Now sometimes it is helpful to know what kind 
of guesses other people make about the same 
[illegible] sometimes it isn't helpful. 


So we're going to let you look at each other's 
guesses. The other person's guess may help you 
to make your next guess, or it may not. It de- 
pends on how good you think [his/her] guess is. 


These instructions, of course, were not read to the 
subjects in the control conditions. 

Subjects. Sixty-eight boys and 70 girls participated 
in the experiment. These children were recruited 
from a suburban elementary summer school in the 
Boston area.5 For the main analysis the scores of 
12 boys and 12 girls were eliminated because they 
failed to understand the instructions. This left an 
unequal and disproportionate distribution of sex and 
ages among conditions, so some additional data 
were further eliminated by a random procedure to 
insure an equal number of subjects in all cells. The 
remaining 96 children ranged in age from 7-7 to 11-6. 

Twelve boys and 12 girls were assigned to each of 
the four conditions. Half of the boys and half of the 
girls in each condition were from the third grade 
(7-, 8-, or 9-year-olds) and half were from the fifth 
grade (10- or 11-year-olds). Assignment to condition
was random. 

A number of children did not seem to understand 
the meaning of a "range of reasonable estimates." 
They either only selected one estimate on each of 
the three displays, or made two or more noncontinu- 
ous, and therefore inconsistent, judgments, for ex- 
ample, crossed out an estimate which lay between 
two other estimates which were checked. Their data 
were omitted from analysis of the “range” data, 
leaving 46 boys and 44 girls for this subsidiary 
analysis.
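
The exclusion rule for the "range" data can be made concrete with a short sketch. It assumes each child's response on one display is recorded as pairs of (estimate, checked); the function name `acceptable_range` and this data layout are hypothetical, and treating a response with no checked estimate as unusable is an added assumption.

```python
def acceptable_range(marks):
    """Return (lower, upper) bounds of the checked estimates, or None.

    marks: list of (estimate, checked) pairs; checked is True when the child
    marked the estimate as one "that may be right" and False when it was
    crossed out.

    None is returned when the response is unusable: fewer than two checked
    estimates, or a crossed-out estimate lying between two checked ones.
    """
    ordered = sorted(marks)                       # sort by estimate value
    flags = [checked for _, checked in ordered]
    checked_idx = [i for i, c in enumerate(flags) if c]
    if len(checked_idx) < 2:
        return None                               # a single estimate is not a range
    first, last = checked_idx[0], checked_idx[-1]
    if not all(flags[first:last + 1]):
        return None                               # noncontinuous judgment
    return ordered[first][0], ordered[last][0]
```

For example, checking 75 and 95 while crossing out 50 and 125 yields the range (75, 95), whereas checking 50 and 95 but crossing out 75 is rejected as inconsistent.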


Results and Discussion 


Between-trial influence effects. The main 
aim of the present experiment was to con- 
firm between-trial influence using spread 
items that differed from the target item. 
Before proceeding to this analysis, it is nec- 
essary to determine whether there are any 
initial differences between experimental and 
control conditions in the change during the 
base-line (preinfluence) series of judgments. 
None of these preinfluence comparisons 
between conditions approached significance. 

A summary of the differences among conditions
in between-trial influence is presented
in Table 2. In each case the experimental and
control groups are compared separately for
Conditions 1 and 2 (i.e., for the two orders
of presentation). The scores compared are 
changes from the last base-line estimate to 


5 The investigators are indebted to the Newton 
Public School System, and in particular Richard 
Adams, Director of the Harvard-Newton Summer 
School, and his staff for their splendid cooperation 
in obtaining subjects and providing space for testing 
the children. 



each of the three postinfluence estimates of 
a display divided by the former estimate. 
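
As a minimal sketch of this score, reading "the former estimate" as the last base-line estimate: each postinfluence estimate is expressed as a proportional change from that base-line value. The function name `change_scores` and the example numbers are hypothetical.

```python
def change_scores(last_baseline, post_estimates):
    """Proportional change from the last base-line estimate.

    Each score is (postinfluence estimate - last base-line estimate)
    divided by the last base-line estimate.
    """
    if last_baseline == 0:
        raise ValueError("last base-line estimate must be nonzero")
    return [(post - last_baseline) / last_baseline for post in post_estimates]


# Hypothetical child: last base-line estimate of 100 dots, then three
# postinfluence estimates of the same display.
scores = change_scores(100, [150, 120, 100])
```

A score of 0 means the child returned to the base-line estimate; positive scores indicate movement toward a higher (bogus) peer estimate.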

As in the previous experiment the boys 
show a clear-cut between-trial influence ef- 
fect. This occurs only on the larger display 
in Condition 2, that is, when this larger 
display (300 dots) is the first of the two post- 
influence (spread) stimulus items to be 
judged. Unlike the previous study, the dif- 
ference between conditions reached the .01 
level of significance on both the initial (first) 
and final (third) postinfluence estimates. 
There is absolutely no effect in Condition 1, 
that is, where the 300 display is the second 
of the two postinfluence displays judged. 
There is also no effect on the 70-dot display 
in either condition.

The results for the girls are generally simi- 
lar but, as in the earlier study, they fail to 
reach significance, leaving it doubtful whether 
girls of this age and under these conditions 
use knowledge of a peer’s discrepant estimates 
on one item to make judgments of other 
items. But near-significant tendencies in the
girls’ results include those measures on which
the boys showed their effects, that is, for the
larger display (300 dots) in Condition 2.

The most striking aspects of these findings
are: that between-trial influence occurs only
on the larger of the two displays when it is
the first display judged after the influence
series; unlike the earlier study, the effects on
judgments of that display are sustained for
all three estimates; and once again the phenomenon
is clearly established only for the
boys.

The fact that the influence effect occurred
only on the display containing 300 dots is
open to at least two interpretations: that influence
effects on such judgments will only
occur when the display to be judged is large
and its numerosity highly uncertain, or that the
carry-over or between-trial effect will occur
only when the spread display is larger than
the target (if the bogus estimate is larger
than the subject's own initial judgment).
That is, the difference between the order
conditions may be due to the absolute or


TABLE 2


BETWEEN-TRIAL AND WITHIN-TRIAL INFLUENCE: MEDIANS BY SEX AND CONDITION AND CHI-SQUARE
TESTS OF DIFFERENCES BETWEEN CONDITIONS BY SEX: EXPERIMENT II


Measures (columns, left to right):
Main within-trial (on 150-dot display)
Between-trial on smaller (70-dot) display: Initial, Final
Between-trial on larger (300-dot) display: Initial, Final

Boys’ medians
Order 1
Experimental .294 A18 .244 A15 post
Control .054 .228 .225 .097 .092
Order 2
Experimental .771 .032 .410 .877 .928
Control .061 0 0 .047 .196
Chi-squares
Experimental versus control
Order 1 ns ns ns ns ns
Order 2 8.17* e 8.17*
Girls’ medians
Order 1
Experimental .821 .152 .245 .064 .053
Control -060 0 .227 0 —.046
Order 2
Experimental .626 0 .084 .296 -516
Control 0 .017 .476 0 —.011
Chi-squares
Experimental versus control
Order 1 ns ns ns
Order 2 8.17* ns A s ns


TABLE 3


CORRELATIONS (KENDALL’S TAU) OF WITHIN-TRIAL AND
BETWEEN-TRIAL INFLUENCE BY DISPLAY,
CONDITION, AND SEX: EXPERIMENT II


                               Condition
Displays                   Order   Influence       Boys      Girls

150 (within-trial) and       1     Experimental    .534***   .385**
70 (between-trial)                 Control         ns        .297*
                             2     Experimental    .492**    ns
                                   Control         ns        ns

150 (within-trial) and       1     Experimental    .626***   ns
300 (between-trial)                Control         ns        ns
                             2     Experimental    .596***   ns
                                   Control         ns        ns


Note.—In each case the measure of between-trial influence
was based on the first postinfluence estimate of that display.
* p < .10.
** p < .05.
*** p < .01.


relative stimulus value. There is no way to 
distinguish between these two interpretations 
within the present design. 

The sex difference obtained is consistent
with the earlier study. It should be noted,
however, that in both studies the girls did
show a possible between-trial effect, but one
which does not reach an acceptable level of
significance. Therefore, it cannot be claimed
that the between-trial effect is restricted to
boys, but rather that it is only for the boys
that it has been clearly demonstrated.

This possible sex difference is not due to 
any difference in the apparent difficulty of 
the task. Two measures of difficulty were 
obtained: the overall actual accuracy of the 
judgments during the base-line series,6
and the interindividual variability of judgments
on a given display during the base-line
series. In neither case is there any reliable
difference between boys and girls. 

Within-trial influence effects. In the present
experiment the instructions were designed to
remove any self-instruction, namely, that
changing one’s judgment to agree with the peer
was tantamount to cheating. It was expected


6 Overall accuracy scores for the judgments of the 
three displays were calculated by taking the differ- 
ence between the actual number of dots and the 
median estimates for that sex group on that display. 
There were no differences between the sexes in 
accuracy on any of the three displays. The boys and 
girls both underestimated the 70 display by 30. The 
boys underestimated the 150 display by 67 and the 
girls by 65. The boys underestimated the 300 display 
by 170 and the girls by 175. 
that this might promote within-trial along
with between-trial influence effects for the 
boys. This aim was frustrated, however, by 
an unanticipated difference between condi- 
tions in the base-line (or preinfluence) scores. 
The experimental subjects among the boys 
had higher base-line change scores than the 
control subjects in Condition 2 (p < .05). It 
is precisely in this condition that the experi- 
mentals show higher influence scores than the 
controls (p < .01), as may be seen in Table 
2. The difference during the influence series 
may then be due to an initial (unexplained) 
difference between the two groups, in their 
preinfluence estimates. 

The girls’ data are not complicated by 
base-line differences. As in the earlier study, 
they show clear evidence of within-trial influ- 
ence (p < .01). This effect is also restricted 
to Condition 2. In Condition 1 the trend is
in the predicted direction for boys and girls, 
but falls far short of an acceptable level of 
significance. 

Correlations of within-trial with between- 
trial effects. The third aim of the experiment 
was to examine how the two forms of 
social effects on judgments, within-trial and 
between-trial influence, relate to one another. 

Tests of association by Kendall’s tau along 
with the significance levels are presented in 
Table 3. The data for the boys exhibit 
moderately strong and highly significant cor- 
relations between within-trial movement on 
the target display and between-trial move- 
ment on both spread displays for subjects in 
both of the order conditions (1 and 2). For 
girls in the experimental conditions the cor- 
relation of within-trial movement to between- 
trial movement on the larger spread display 
(300 dots) only reaches a borderline level of 
significance. There is a significant relationship 
for within-trial change on the target display 
containing 150 dots and between-trial change 
on the smaller of the spread displays (70 
dots), but only for Order Condition 1. 
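
For readers unfamiliar with the statistic, a minimal sketch of Kendall's tau follows. The text does not say which variant or tie correction was used; the sketch computes the simplest form (tau-a), counting tied pairs as neither concordant nor discordant, and the function name `kendall_tau_a` is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a between two equally long sequences of scores.

    x might hold each subject's within-trial change on the target display
    and y the same subject's between-trial change on a spread display.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1        # the pair is ordered the same way on both
        elif product < 0:
            discordant += 1        # the pair is ordered in opposite ways
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs
```

A positive value means that subjects who moved more within trials also tended to move more between trials, which is the pattern reported for the boys.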

The two interesting points to be gleaned 
from these results are that the correlations 
are consistently higher for boys than for girls, 
and that the correlations occur in the experi- 
mental and not in the control conditions. This 
latter point suggests that the source of the 
relationship lies in the social rather than the 
nonsocial aspects of the judgmental task. In
fact, this pattern of associations may be
taken as further evidence of between-trial
influence. The main analysis, reported above,
relies on gross comparisons between groups. 
The fact that a subject’s judgment on subse- 
quent displays may be predicted from the 
immediate or within-trial change on the initial 
display provides additional evidence that early 
social effects on judgment carry over to 
judgments on new items. 

These data are relevant for another issue: 
whether a necessary condition for between- 
trial influence is the absence of within-trial 
influence. As in Experiment I, the prevalence 
of positive, rather than negative, correlations 
casts doubt on this hypothesis. 

Influence effects on the range of acceptable 
estimates. The fourth aim of the study was 
to determine whether the exposure to discrepant
judgments on one display affected the
subject’s range of acceptable judgments on
that display and on the other two displays.
These judgments in each case provide an
upper and lower limit or bound.

The differences between the boys in the ex- 
perimental and control conditions parallel the 
results using the child’s “best” estimate. In 
particular, there is an influence effect in their 
judgments of the upper bound of the larger 
(300 dot) of the two spread displays in Order 
Condition 2 (p < .05, two-tailed). There are 
also tendencies (at the .10 level) for the lower 
bound of reasonable estimates of the initial 
(150-dot) display and larger spread display 
to be affected, again only in Condition 2. The 
girls exhibit possible influence effects (p < 
.10) in their judgments of the upper bound 
of the initial (target) display and the larger 
spread display, and the lower bound of the 
smaller (70-dot) spread display, all in Order 
Condition 1. 

It is not completely clear why the girls 
show possible influence effects in their range
of acceptable estimates but not in their best
estimates. One possibility lies in the fact that 
the range of acceptable estimates was meas- 
ured by the subject's choosing from among 
fixed alternatives. The presence of very high 
estimates may have led the girls to assume 
that the experimenter considered very high 
estimates to be reasonable. 


CONCLUSION 


The present study has confirmed the 
between-trial influence phenomenon with 
children. It has furthermore demonstrated 
greater stability in the effect than in our 
first experiment. The effect, however, appears 
to be constrained by certain stimulus factors. 


It cannot be determined from the present
experiment whether these stimulus factors
reside in the absolute characteristics of the
spread display or in the relation between the 
target and spread display. The carry-over of 
social influence effects also appears to change 
the upper and lower boundaries of the range 
of reasonable estimates. 
PSYCHOPHYSICAL SCALE MATCHING AS A
PROTOTYPICAL LANGUAGE TASK 1


THORNTON B. ROBY AND CHARLES R. BUDROSE 2
Medical Research Council, Cambridge, England 


This study concerned the effects of variations in method of feedback in a task 
requiring 2 Ss to match brightnesses, on the basis of verbal communication 
using a numerical scale which had initially no agreed upon meaning. 2 Ss 
faced opposite sides of a separating partition which displayed 2 light bulbs. 
A trial consisted in E adjusting the brightness of 1 bulb; 1 S, T, viewing the 
bulb and assigning a number to this brightness; the 2nd S, R, adjusting the 
brightness of the other bulb in an effort to achieve the brightness announced 
by T; and viewing of the results of the attempted match by either one or
both Ss. The main criterion measure, the mean squared difference between the
2 brightnesses, was broken down into linear and nonlinear components. The
poorest performance occurred with feedback to T, next to TR, and the best
to R. Nonlinear error was greater than linear error.


Human communication is based in large 
part on consensus as to the meaning of the 
symbols employed. The usefulness of language 
as a medium of social interaction depends on 
words having a common referential basis for
the persons who are using them. The objective
of the series of experiments of which this is a
part is to investigate the way in which this 
consensus may be achieved. 

There are several aspects of this problem.
Certainly the most fundamental one is that
the dimensions or categories of description
must match in some way. That is, two persons 
who are using verbal symbols must slice the 
world along the same lines. The fact that we 
do this rather successfully may be due to 
innate tendencies to categorize the world 
similarly or it may be a consequence of the 
fact that similar attributes are of almost 
universal relevance for human adaptation. In 
any case, it is true that this is a very difficult 
problem to get any research leverage on just 
because of the high level of development that 
is observed in ordinary language. The problem 
seems to be closely tied to concept formation, 
and other studies in the present series are 
related to this topic (Roby & Budrose, 1965).3

1 This research was supported in part by the Office 
of Scientific Research, United States Air Force, under 
Grant 779-65, and in part by the Sperry Rand Research 
Center, Sudbury, Massachusetts. 

2 The able assistance of Michael Wade of Kalamazoo
College and Sharon McHold Lawrence of the University
of Delaware is gratefully acknowledged.

3 "The Verbal Transmission of Geometric Pattern
Information," unpublished manuscript.


A secondary aspect of language use entails 
agreement as to the gradations or subclasses 
within a categorical dimension. For example, 
agreeing that color is a distinguishable attri- 
bute, persons using color names will have to 
decide what will be red, green, and blue. 
Agreeing that size is an important attribute 
of their environment, they must agree on 
what is small, what is middle size, and what is 
large; or, better still, establish units that can 
be used as yardsticks of measurement. 

The present study is aimed at developing a 
methodology for studying these aspects of 
language formation. It is hoped to bring them 
into the laboratory, to attempt to control some 
of the factors that may influence them, and 
to develop measurements which characterize 
the learning process. 

Social learning processes by which these 
agreements are first established and later 
implemented are complex and little under- 
stood. The simplest process, ostensive defini- 
tion, can account for only a small portion of 
referential learning. It must be supplemented
by secondary definitions and by more subtle 
learning conditions. The present study is 
aimed at developing a methodology for a 
systematic investigation of language-learning 
processes. The particular emphasis of this 
study is on the effect of various feedback or 
confirmation conditions when direct ostensive 
appeal to the referent is precluded. 

The special “language” of which the genesis 
is here examined is one relating a numerical 
scale to the brightness of physical stimuli. 
The team task requiring this language is one
in which an observer, shown an experimentally 
controlled input, attempts to convey its 
brightness in numerical terms to a visually 
isolated partner who is charged with repro- 
ducing the original brightness setting. This 
laboratory situation models a number of real- 
life tasks in which one or more members of a 
team act as scouts or extended sensors for 
other team members who make decisions or 
take adjustive action. It is somewhat atypical 
in that the required action is simply to duplicate
the original environmental observation.

A second atypical feature of this task is 
the rudimentary nature of the relevant stim- 
ulus continuum and the associated language 
system. Although this feature may limit the 
generality of substantive findings, it has two 
immediate experimental advantages. First, 
there is enough ambiguity in the description 
of the stimulus referents so that social learning 
can come into play; and, second, the data 
arising from the study lend themselves to 
efficient and intensive analysis. With these 
data it is possible to examine separate aspects 
of the process in detail and to estimate the 
effects of various components in the system. 
Hopefully, it will be possible later to adapt the 
methodological techniques described below to 
more realistic task situations and richer 
languages. 


METHOD 
Procedure 


The input from the environment in the task situation
was the brightness of a lightbulb adjusted by the
experimenter. This was viewed by one subject, T,
who then assigned a number to this brightness. Upon
hearing this number a second subject, R, adjusted
another lightbulb to a brightness he thought corresponded
to the number given by T. The number range
available to T was given to him on a card but not
announced to R.

Subjects were given general instructions on the
nature of the task and on their objective as a team,
which was to match, on the bulb controlled by R, the
setting initially introduced by the experimenter on the
bulb seen by T. They were also informed of the three
feedback conditions: in feedback condition T, the lamp
set by R was shown to T; in feedback condition R, the
lamp originally seen by T was shown to R; and in
feedback condition TR, both lamps were shown to both
subjects. Feedback was given after each trial.

FIG. 1. Task apparatus.


This was repeated for a set of 12 stimulus settings
ranging from 18 to 62 milliamperes in 4-milliampere
steps. The same 12 settings were given two more times
without interruption. For each of these 12 trial blocks
the ordering of 12 stimuli was a distinct randomization.
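
The stimulus schedule for one feedback condition can be sketched as follows, reading the passage as three trial blocks of the same 12 settings, each block in its own random order. The names `SETTINGS_MA` and `trial_blocks` and the seeded generator are hypothetical; how the original randomizations were produced is not stated.

```python
import random

# The 12 stimulus settings: 18, 22, ..., 62 milliamperes.
SETTINGS_MA = list(range(18, 63, 4))


def trial_blocks(rng, n_blocks=3):
    """Return n_blocks orderings of the 12 settings, no two alike."""
    blocks = []
    while len(blocks) < n_blocks:
        order = SETTINGS_MA[:]
        rng.shuffle(order)
        if order not in blocks:          # keep each block's ordering distinct
            blocks.append(order)
    return blocks


blocks = trial_blocks(random.Random(1))  # fixed seed, for reproducibility only
```

Each block then contains every setting exactly once, so a block yields one squared-error score summed over the 12 stimuli.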

After subjects had completed three trial blocks 
within a given feedback condition, T was given a new 
scale range card, and the new feedback condition was 
announced. All three feedback conditions were com- 
pleted in one experimental session, lasting approxi- 
mately 1.5 hours. 


Apparatus


The task apparatus is shown in Figure 1. A dividing
partition prohibits direct visual contact between two 
subjects serving as partners and also serves as a panel 
for the displays shown to the two partners. Two light 
bulbs are contained in two boxes embedded in this 
partition. On each side of both light bulbs there is a 
shutter which is controlled by the experimenter, 
enabling the subject on that side of the partition to 
view the bulbs or not. 

The current through each bulb is controlled by a
rheostat. One rheostat is mounted on the base of the
partition support and is controlled by the subject
designated R. The other rheostat is contained at the
experimenter's station. The current through the bulbs
controlled by these two rheostats is measured on a
pair of ammeters that are visible only to the experimenter.


Design


Table 1 gives the basic design of the experiment. Two
Latin squares were used which exhaust the permutations
of the three feedback conditions. Within each of
these Latin squares two teams were run under the same 
permutation but with different scale conditions. 
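
The claim that two Latin squares exhaust the six orderings of three conditions can be checked with a short sketch. The helper names `cyclic_square` and `is_latin_square` are hypothetical; the two starting rows are chosen so that the squares reproduce the feedback orders listed in Table 1.

```python
from itertools import permutations

CONDITIONS = ("T", "TR", "R")


def cyclic_square(first_row):
    """Build a Latin square by cyclically rotating the first row."""
    n = len(first_row)
    return [tuple(first_row[(i + j) % n] for j in range(n)) for i in range(n)]


def is_latin_square(rows):
    """True when every symbol occurs once in each row and each column."""
    n = len(rows)
    rows_ok = all(len(set(row)) == n for row in rows)
    cols_ok = all(len({row[j] for row in rows}) == n for j in range(n))
    return rows_ok and cols_ok


square_1 = cyclic_square(("T", "TR", "R"))   # T-TR-R, TR-R-T, R-T-TR
square_2 = cyclic_square(("TR", "T", "R"))   # TR-T-R, T-R-TR, R-TR-T

# Together the six rows cover every permutation of the three conditions.
covers_all = set(square_1 + square_2) == set(permutations(CONDITIONS))
```

With two teams per row, the two squares account for the 12 teams of the design.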

Six scales were used in two sets of three scales. The
principal variable in the scales was the span of numbers
covered. For two of the scales the span was 24, for two
of the scales the span was 36, and for two of the scales
the span was 48; the numbers always ranged between
0 and 100.4 In Table 1 the capital letters designate
feedback conditions. The numbers represent the span
of numbers covered: 1 = span of 24, 2 = span of 36,
and 3 = span of 48. The lowercase letters, a and b,
represent two distinct sets of stimuli within each span.


TABLE 1


ASSIGNMENT OF TEAMS TO FEEDBACK ORDERS
AND SCALE CONDITIONS


                  Session
Teams      1        2        3

  1       T 1a     TR 2a    R 3a
  7       T 2a     TR 3a    R 1a
  3       TR 3b    R 1b     T 2b
  9       TR 1b    R 2b     T 3b
  5       R 2a     T 3b     TR 1b
 10       R 3a     T 1a     TR 2b
  2       TR 2a    T 3a     R 1a
  8       TR 1a    T 2a     R 3a
  4       T 1b     R 2b     TR 3b
 11       T 3b     R 1b     TR 2b
  6       R 3a     TR 1a    T 2b
 12       R 2a     TR 3b    T 1b


Subjects 


The subjects were 24 college students who had 
volunteered to participate in experiments during the 
summer vacation and who were paid at the rate of 
$1 an hour for participation. Subjects were of both
sexes. No attempt was made to control for previous
acquaintance.


RESULTS


The data will be examined from several 
standpoints. The first results presented are 
purely descriptive. They offer some compari- 
son with other findings in the literature on 
brightness scaling, and also help justify certain 
statistical procedures employed later. 

A second aspect of the analysis relates to the
question of how well subjects performed, to
what extent they succeeded in transmitting
the information, and how this varied over the
experimental conditions. The basic criterion
of successful performance used in this study
is mean squared error, which will be analyzed


4 Due to an administrative oversight the scales were 
not arranged orthogonally with respect to other condi- 
tions. However, analysis tends to show that this does 
not compromise any of the important results of the 
study. 
rather intensively. However, supplementary
measures are also examined which convey 
different aspects of the information, namely, 
the product-moment and rank-order correla- 
tions between the initial stimulus values and 
the transmitted stimuli. As noted later, there 
may very well be practical situations in which 
the information carried either by linear cor- 
relation or by ordinal correspondence might 
suffice to insure appropriate task behavior. 

A third question involves the specific break- 
down of error components—that is, how much 
of the error is attributable to the uncertainty 
or error on the part of the individual partici- 
pants as to their own individual scale trans- 
formations, and how much of the error is due 
to disagreement between the subjects as to 
the scale transformation? 


Descriptive Aspects


The first analysis treats the data as though 
they had been obtained in a conventional 
psychophysical scaling study. In the present 
case, variants of two of the standard scaling 
operations are employed. Person T on each 
team uses the "magnitude estimation" method
and Person R uses the "magnitude production"
method (Stevens, 1958). A slight change in 
customary procedure is introduced by the 
experimental manipulation of numerical scale 
ranges. 

The results are shown in Figure 2. These 
data are for the third and final trial block, and 
combine the three feedback conditions for all 
subjects. Each of the 12 T subjects contributes 
three estimates for each of the 12 stimulus 
points on the Brightness (B) to Number (N) 
scales. This is T on S in Figure 2. In order to
facilitate comparison, the inverse N to B 
scales were constructed by grouping the final 
adjusted brightnesses in 4-milliampere inter- 
vals, and obtaining the means of the associated 
Ns. This is T on R in Figure 2. Thus the points 
in the N to B scales are also based on an 
average of 36 cases each, but there is some 
variation due to uneven incidence of adjusted 
brightnesses in various intervals. 

Two aspects of these results are relevant to
subsequent analyses. First, the scales for the
B to N and N to B transformations are very
similar to each other. This means that, taken
in the aggregate, the coding and decoding
transformations that occur are inverse to each
other: there are no systematic distortions introduced
by inherent dissimilarities in these
psychophysical transformation operations.
More directly apropos is the evident linearity
of the scales over the range here investigated.5


FIG. 2. Plots of mean T measures for the six groups within each scale. (Each S point represents an actual
stimulus. Each R point represents the midpoint of a range of R responses. T on S is represented by a dotted line;
T on R, a solid line.)


5 It should be noted that the physical measure used
here, milliamperes of current, has a decidedly nonlinear
relation with brightness as conventionally measured.
We do not present this function because it is suspected
that the judgments here studied use bulb color in 
addition to “pure” brightness. 


This gives some justification for the device 
adopted below of accepting the best fitting 
straight line—the linear regression equation— 
as representing most of the information in the 
scale transformation. As will be seen, this 
assumption greatly increases the precision of 
analysis that can be applied. 


Components of Transmission Error 


The central results of this study concern
discrepancies between the stimulus brightnesses
presented to T and the values finally
produced by R. Denoting the former by S and
the latter by R, the basic measure of these
discrepancies will be the sum, $\sum_i (S - R)^2$,
taken across all 12 stimuli in a trial block.6
The major advantage of this error measure is 
that it permits a partition of the total error 
into meaningful components. 

The analysis of the overall error term
$\sum (S - R)^2$ is described in detail and illustrated
in the next section. It results in five
additive components:


1. Mean discrepancy represents the differ- 
ence in level between the brightness values 
which Persons T and R associate with identical 
numbers. 

2. Slope discrepancy refers to the difference 
in slopes between the best fitting regression 
lines for T and R. 

These two components, collectively, are 
described as linear error. It should be noted 
that the correlation between S and R is un- 
affected by this part of the discrepancy: put in 
another way, a team can transmit full ordinal 
—or even interval—information on a set of S 
values no matter how large their mean and 
slope discrepancies. The three remaining 
terms constitute the nonlinear component. 

3. T error refers to the deviations of the 
original S values from the best fitting regression 
line relating those values to T’s numerical 
response. These deviations are not solely error, 
of course; they also reflect nonlinearity in 
scaling and deliberate adjustments to feedback. 

4. R error refers to the deviations of R's 
response values from the best fitting regression 
line relating those values to T’s numerical 
response: the same qualification applies as for 
T error. 

5. TR cross-product describes the correlation 
between the above nonlinear components in 


6 Other error measures have also been examined,
including summed absolute errors and various trans- 
formations of the sums of squares. These scores are, 
however, less convenient for manipulations employed 
below than are the squared errors, and the results, 
which corroborate those here obtained, will not be 
reported. 

The distribution of the squared error scores was 
plotted and is not seriously nonnormal either in terms 
of skewness or kurtosis. This would be expected if the 
scores are, as assumed, distributed according to non- 
central chi-square with df = 12. 
T and R transformations. If both partners
adopt a similar nonlinear scale function, this 
term takes on a large negative value, canceling 
out part of the effect of nonlinearity. 


Derivation and Illustration of the Error 
Partitioning 


The partitioning procedure is most readily presented 
in two stages. The first stage fractionates the error 
sum of squares into linear and nonlinear components, 
and the second stage reduces these components to the 
five-way breakdown described above. 

This partition uses the assumption that most of the
information transmitted in the communication process
is represented by linear scales. Speaking broadly, T
translates the brightness into numbers by a linear
regression equation, and R translates the numbers back
into brightness adjustments by a second regression
equation. Both the experimental input S and the
reproduced output R are first broken up into two
factors: $(S - \hat{S})$ and $(\hat{S})$; and $(R - \hat{R})$ and $(\hat{R})$,
respectively. Here $\hat{S}$ is the hypothetical value of the
brightness based on the inverse of the regression
equation used by T in converting brightness settings
into numbers, and $\hat{R}$ is the hypothetical value of the
reproduced brightness based on the actual regression
used by R.

If both persons produced strictly linear scales without
error, $\hat{S}$ and $\hat{R}$ would be the actual input and
reproduced brightness corresponding to fixed numbers,
and all transmission error would be due to discrepancies
between these regression equations. The first level
partitioning divides the squared error into that attributable
to differences between the linear regression
lines (linear error) and deviation from linear regression
(nonlinear error). The relevant equations are:


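$$
\begin{aligned}
(S - R)^2 &= [(S - \hat{S}) + \hat{S} - (R - \hat{R}) - \hat{R}]^2 \\
&= (\hat{S} - \hat{R})^2 + 2(\hat{S} - \hat{R})[(S - \hat{S}) - (R - \hat{R})] \\
&\quad + [(S - \hat{S}) - (R - \hat{R})]^2. \qquad [1]
\end{aligned}
$$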


Taking sums across all brightness settings, the middle
term vanishes,7 leaving,


$$\sum (S - R)^2 = \sum (\hat{S} - \hat{R})^2 + \sum [(S - \hat{S}) - (R - \hat{R})]^2 \qquad [2]$$


in which the first term on the right-hand side refers to 
linear discrepancies and the second term to nonlinear 
discrepancies. Both of these parts are then further 
reduced. 

Considering first the linear part, set 


$$
\hat{S}_i = \bar{S} + b_T (N_i - \bar{N}) \qquad [3]
$$
$$
\hat{R}_i = \bar{R} + b_R (N_i - \bar{N}) \qquad [4]
$$


where S̄, R̄, and N̄ are mean values, b_T and b_R are the
respective regression slopes, and N_i are numerical


1 The reason for this is that (S - Ŝ) and (R - R̂),
being deviation values, must be uncorrelated with any
linear set of data and hence with the difference between
the two sets of linear points, Ŝ and R̂.
TABLE 2
ILLUSTRATIVE DATA FOR COMPUTATION OF ERROR COMPONENTS

                     S       N       R     S - R      Ŝ      S - Ŝ      R̂      R - R̂    Ŝ - R̂
                    (1)     (2)     (3)     (4)      (5)      (6)      (7)      (8)      (9)
                     54      74      31      23     55.19    -1.19    39.56    -8.56    15.63
                     38      53      35       3     39.56    -1.56    33.24     1.76     6.32
                     26      38      29      -3     28.40    -2.40    28.72      .28     -.32
                     34      42      31       3     31.38     2.62    29.92     1.08     1.46
                     62      79      48      14     58.92     3.08    41.07     6.93    17.85
                     42      44      32      10     32.86     9.14    30.52     1.48     2.34
                     18      34      26      -8     25.42    -7.42    27.51    -1.51    -2.09
                     50      76      39      11     56.69    -6.69    40.17    -1.17    16.52
                     58      78      41      17     58.18     -.18    40.76      .24    17.42
                     30      37      28       2     27.65     2.35    28.42     -.42     -.77
                     22      35      27      -5     26.16    -4.16    27.81     -.81    -1.65
                     46      53      34      12     39.56     6.44    33.24      .76     6.32
Sum*                480     643     401      79    480.00      .00   401.00      .00    79.00
Sum of squares*       -  38,089       -   1,499       -     273.65      -     132.95  1,234.56


* Sums shown do not reflect rounding errors in the computations. 


values used by T. Then 


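$$
\begin{aligned}
\sum (\hat{S} - \hat{R})^2 &= \sum \left[(\bar{S} - \bar{R}) + (b_T - b_R)(N_i - \bar{N})\right]^2 \\
&= \sum (\bar{S} - \bar{R})^2 + (b_T - b_R)^2 \sum (N_i - \bar{N})^2. \qquad [5]
\end{aligned}
$$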


Here the left-hand term represents differences in level 
and the right-hand represents differences in slope. 

The nonlinear component breaks up into three parts 
as follows:


$$
\sum \left[(S - \hat{S}) - (R - \hat{R})\right]^2
= \sum (S - \hat{S})^2 + \sum (R - \hat{R})^2 - 2 \sum (S - \hat{S})(R - \hat{R}). \qquad [6]
$$


The first term on the right-hand side represents Person
T's “error” in translating the stimulus brightness into
numbers. The second term is R's error in making the
inverse (presumably linear) transformation. The final
cross-product term indicates the degree to which T and
R agree on points at which they depart from linearity.
This final term, therefore, gives us an explicit check on
whether the assumption is warranted that most of the 
transmitted information is linear. 

In order to illustrate the computational procedure, 
the data and derived measures for a representative 
trial block are given in Table 2. Column 1 contains the 
stimulus brightness values presented to Person T; 
Column 2 contains the numbers that T assigns to those 
brightness values; and Column 3 contains the bright- 
ness adjustments assigned to those numbers by Person 
R. Column 4 indicates the algebraic discrepancies be- 
tween presented and transmitted brightness values: 
the summed squares of these discrepancies, shown at 
the bottom of the column, is the basic error score which 
is to be reduced to components. 

Column 5 contains the predicted S values, based on
the regression equation Ŝ_i = .11 + .744 N_i, and
Column 6 lists discrepancies between these values and
the stimulus values actually presented. Columns 7 and
8 present the corresponding data for the predicted R
values, based on the regression equation R̂_i = 17.27
+ .301 N_i.


Column 9 contains deviations between the predicted 
values in Columns 5 and 7, and the summed squares 
of these quantities represents the linear component of 
error variance. Equation 5 is used to reduce this to 
two terms. The first is attributed to differences in level 
and is given by (480 - 401)²/12 = 520.08. The second
part is attributable to slope differences and is given
by (.744 - .301)²(12 × 38,089 - 643²)/12 = 713.38.
The sum of these terms, 1233.46, agrees, except for a
small rounding error, with the directly calculated
sum, 1234.56.

Subtracting the linear term from the overall summed
squares for discrepancies, one obtains a total nonlinear
error term equal to 1499 - 1234.56 = 264.44, which is
reduced to three components by Equation 6. The sums
of squares due to T and R are, respectively, 273.65
and 132.95, adding to 406.60. From this is subtracted
twice the sum of cross-products between Columns 6
and 8; that sum equals 70.74 (corresponding to a correlation
of .37). The resulting net sum of squares for nonlinear
factors equals 265.13, which again matches, within
rounding error, the above value as obtained by subtraction.
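
The computation just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function and variable names are hypothetical, the S, N, and R values are the twelve settings of Table 2 (checked against the column totals 480, 643, and 401 and the sum of squares 38,089), and both regressions are assumed to be ordinary least-squares fits on the transmitted numbers. Because the sketch carries full precision, its components may differ from the printed figures by small rounding amounts.

```python
# Table 2 data: presented brightness S, transmitted number N,
# reproduced brightness R for the 12 settings of one trial block.
S = [54, 38, 26, 34, 62, 42, 18, 50, 58, 30, 22, 46]
N = [74, 53, 38, 42, 79, 44, 34, 76, 78, 37, 35, 53]
R = [31, 35, 29, 31, 48, 32, 26, 39, 41, 28, 27, 34]


def mean(xs):
    return sum(xs) / len(xs)


def fit_line(x, y):
    """Least-squares slope and intercept of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def partition_error(S, N, R):
    """Five-way breakdown of the summed squared error (Equations 2, 5, 6)."""
    n = len(S)
    b_t, a_t = fit_line(N, S)   # assumed fit for S-hat (Column 5)
    b_r, a_r = fit_line(N, R)   # assumed fit for R-hat (Column 7)
    dev_t = [s - (a_t + b_t * x) for s, x in zip(S, N)]   # Column 6
    dev_r = [r - (a_r + b_r * x) for r, x in zip(R, N)]   # Column 8
    n_bar = mean(N)
    return {
        "level": n * (mean(S) - mean(R)) ** 2,
        "slope": (b_t - b_r) ** 2 * sum((x - n_bar) ** 2 for x in N),
        "t_error": sum(d * d for d in dev_t),
        "r_error": sum(d * d for d in dev_r),
        "cross": -2 * sum(dt * dr for dt, dr in zip(dev_t, dev_r)),
        "total": sum((s - r) ** 2 for s, r in zip(S, R)),
    }


parts = partition_error(S, N, R)
for name, value in parts.items():
    print(f"{name:>8}: {value:9.2f}")
```

The five components sum to the total squared error exactly (apart from floating-point error), which is the property that Equations 2, 5, and 6 jointly assert.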


Error Components under Various Conditions 


Analyses of variance were performed on the 
experimental data with error components, 
teams, feedback conditions, test sessions, and 
trial blocks as the main axes. Table 3 summar- 
izes the important results for the two-way 
(linear versus nonlinear) component analysis. 
All higher order interactions have been pooled. 
This provides a rather conservative error term, 
as it includes between-team variance which 
does not enter into principal experimental 

8 The computations required for these analyses were
performed at the Massachusetts Institute of Technology
Computation Center.
comparisons. The comparisons of chief substantive
interest concern the main effects and
interactions due to error components, feedback 
conditions, and trial blocks. The relevant 
subclass means are given in Table 4. Each 
mean in this table is based upon the perform- 
ance of all 12 teams. 
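
Each F ratio in Table 3 is an effect's mean square divided by the mean square of its stated denominator. As a hedged check on the arithmetic, the sketch below recomputes the ratios for four rows whose denominator is the residual mean square; the dictionary keys are hypothetical labels, and the mean squares are copied from Table 3.

```python
# Mean squares copied from Table 3; keys are labels chosen for this sketch.
mean_squares = {
    "A (linear-nonlinear)": 4_027_296,
    "C (feedback)": 797_251,
    "E (trial blocks)": 1_287_454,
    "B x C x D": 414_612,
}
residual_ms = 125_710  # residual mean square, 92 df

# F = MS(effect) / MS(denominator); here every denominator is the residual.
f_ratios = {effect: ms / residual_ms for effect, ms in mean_squares.items()}

for effect, f in f_ratios.items():
    print(f"{effect:<22} F = {f:6.2f}")
```

The recomputed values agree with the tabled F ratios to within rounding in the second decimal place.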

The first result to be noted is that nonlinear 
error factors are significantly greater than 
linear factors. Roughly speaking, this can be 
interpreted to mean that more of the total 
transmission error is accounted for by indi- 
vidual departure from a straight line regression 
equation than by the differences between the 
regression equations used by the two subjects. 

The second main effect, “replications,” is an
estimate of between-team variance. It also
reflects interaction between feedback conditions
and the specific scales, but other evidence
suggests that these are of negligible importance.
Terms containing the Feedback × Sessions
interaction (C × D) may also be regarded as
estimating between-team differences. Three of
these terms (C × D, A × C × D, and B × C × D)
are significant.

The principal experimental effect is that due
to the feedback conditions, T, R, and TR.
t tests indicate that the differences between
feedback conditions T and R, and T and TR
are significant at the .01 level and the difference
between R and TR is not significant.
It will be noted in Table 4 that poorest performance
occurs in Condition T, next poorest
TABLE 3


ANALYSIS OF VARIANCE OF THE SQUARED
ERROR COMPONENTS


Source                   df       MS          F        Denominator
Linear-nonlinear (A)      1   4,027,296    32.03***    (Residual)
Replications (B)          3     358,615       -        (B × C × D)
Feedback (C)              2     797,251     6.34***    (Residual)
Sessions (D)              2     580,927       -        (C × D)
Trial blocks (E)          2   1,287,454    10.24***    (Residual)
AXC 2 | 1,002,795 — (A XC XD) 
AXD 2 | 1,336,419 — 
PERS E (AXCXD) 
BXC 6 
BXD 6 
Hs : 
4 | 1,090,688 .639* 
EXP z 2.63 (BXCXD) 
AXBXC 6 
x 6 304,695 2.42** i 
AXBXD $ 2** 4 (Residual) 
Asse | 
4 537,863 | 4.28*** Si 
AXCXB ^ 8 (Residual) 
4 308,874 | 2.46** |(Residual) 
BXCXD 12 414,612 | 3.30*** | (Resi 
BXCAD d (Residual) 
BXDXE 12 
CXDXE 8 
Residual                 92     125,710
* p < .10.
** p < .05.
*** p < .01.


in the double feedback Condition TR, and 
best performance in Condition R, the latter
difference being very slight. This ordering 
confirms the results of two previous unreported 
studies. 

The effect due to sessions, as already noted, 
seems to be due entirely to the reduction in the 
linear component between the first and suc- 
ceeding sessions. The final main effect, “trial 


TABLE 4
SQUARED ERROR COMPONENTS BY FEEDBACK CONDITION AND TRIAL BLOCK

Trial                      Total                                Total
block    Level    Slope    linear      T        R      T × R    nonlinear    Total

T
  1      763.0    242.8    1006.0    385.1    477.0     19.9      883.0     1889.0
  2      357.3    149.4     506.9    416.9    317.3   -154.8      579.5     1086.4
  3      235.5    116.1     351.8    390.4    282.8    -80.0      593.6      945.4
  M      451.9    169.4     621.6    397.5    359.0    -71.6      685.4     1307.0

R
  1      247.5    175.5     423.2    256.4    533.6   -105.6      685.0     1108.2
  2      237.3    169.4     407.1    267.9    361.8   -149.3      480.8      887.9
  3      165.9     68.0     234.4    307.7    364.2    -93.4      578.6      813.0
  M      216.9    137.6     354.9    277.4    419.9   -116.1      581.5      936.4

TR
  1       26.8    155.4     182.4    307.2    674.1    -47.1      934.3     1116.7
  2      226.8     26.0     253.1    295.5    325.6    -37.6      584.0      837.1
  3      165.6     28.4     194.3    271.8    439.7    -13.7      698.2      892.5
  M      139.7     69.9     209.9    291.5    479.8    -32.8      738.9      948.8
blocks,” is described by a decided reduction in
error between the first and second exposures 
to each stimulus set: the means are 685.65, 
468.57, and 441.83, respectively. As shown in 
Table 4 this effect varies over feedback condi- 
tions: both T and R conditions show a more 
definite learning effect than does the TR 
condition. The mean square for interaction, 
C X E, is not quite significant at the .05 level. 

The interaction between linear-nonlinear and 
feedback conditions (A X C) is seen in Table 4 
to hinge on the fact that the nonlinear com- 
ponent is relatively greater for the R and TR 
conditions than for the T condition. It should 
also be noted that the nonlinear component is
appreciably greater in absolute magnitude in 
the TR condition than for the single feedback 
conditions. 

Going beyond the two-way breakdown of 
the transmission error, several observations 
apply to the secondary partition of linear and 
nonlinear components. For the former, it is 
evident that mean differences—differences in 
level—account for a greater share of the 
transmission error than do differences in slope. 
There is only one exception to this in the nine 
subclass means of Table 4, and a similar uni- 
formity appears in the individual team data. 
It also appears to be the case that slope error 
decreases more regularly than level error, but 
there is no statistical verification of this latter 
point. 

Considering the partition of the nonlinear 
component, it is apparent that the R error— 
associated with the number to brightness 
transformation—tends to account for more of 
the total error than does the T error. There are 
only two exceptions to this in the nine sub- 
class means shown here, and the following 
secondary analysis supports the generalization 
even more strongly. 

For each of the 12 teams, under each of the 
three experimental conditions, both product- 
moment and rank-order correlations were com- 
puted between original stimulus magnitude 
and transmitted numbers, and between the 
latter and the reproduced brightness. The 
product-moment correlation between the pre- 
sented brightness and transmitted numbers is 
larger than the latter correlation in 76 out of 

96 cases, and the rank-order correlations differ 
in the same direction in 86 out of 96 cases. 
Thus the “decoding” transformations not only
depart from linearity more but also exhibit
more actual inversions.*
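
As an illustration of this secondary analysis, the sketch below computes both kinds of correlation for the encoding step (presented brightness with transmitted numbers) and the decoding step (transmitted numbers with reproduced brightness). It is a minimal sketch under stated assumptions: the data are the twelve settings of the Table 2 trial block rather than the full set of cases analyzed here, the function names are hypothetical, and the rank-order coefficient is taken to be a product-moment correlation of average ranks, which is one common way of handling ties.

```python
from math import sqrt

# Table 2 trial block: presented brightness, transmitted number,
# reproduced brightness.
S = [54, 38, 26, 34, 62, 42, 18, 50, 58, 30, 22, 46]
N = [74, 53, 38, 42, 79, 44, 34, 76, 78, 37, 35, 53]
R = [31, 35, 29, 31, 48, 32, 26, 39, 41, 28, 27, 34]


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks 1..n; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Rank-order correlation: product-moment correlation of average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


r_encode, r_decode = pearson_r(S, N), pearson_r(N, R)
rho_encode, rho_decode = spearman_rho(S, N), spearman_rho(N, R)
print(f"encoding: r = {r_encode:.3f}, rho = {rho_encode:.3f}")
print(f"decoding: r = {r_decode:.3f}, rho = {rho_decode:.3f}")
```

For this single block the encoding correlation exceeds the decoding correlation, the direction reported for the majority of cases.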

Returning to the squared error results, it 
should also be noted that there is a reversal in
the magnitude of the T and R components 
between the T and R feedback conditions: the 
R component is relatively larger in the R feed- 
back condition and the T component is rela- 
tively larger in the T feedback condition. This 
indicates that these departures from a linear 
scale arise in part from attempts to adjust to 
the experimental feedback. The TR feedback
condition produces the largest errors for both 
subjects, which is presumably related to the 
fact that both persons are making adjustments. 

Finally, it is evident that the departures from 
linear regression of the two partners tend to 
be positively correlated, reducing the total 
contributions of nonlinearity to transmission 
error. This compensatory effect is rather 
small, however, and it does not appear to 
increase steadily over trial blocks. This result 
tends to support the contention that most of 
the information transmitted between partners 
in this task is accounted for by the congruence 
of linear regression lines. 


Correlational Criteria


There are task situations in which it is not
essential for different team members to agree
exactly on the meaning of intervals within a
scale, but only to preserve an ordinal or
interval relationship. To use an example
adapted from Wittgenstein (1953), a journeyman
asking his apprentice to fetch nails for
him may not at first instruct the apprentice
in “penny” units. All that is required is that
the apprentice bring the largest nails, second
largest nails, etc., of a finite assortment.

This rather forced example indicates the
relevance of more “lenient” criteria than those
examined above. That is, social dyads may
differ greatly in the absolute magnitudes and
slopes of scale gradations applied to a given
property, yet still convey to each other the
information required for successful cooperative
behavior.


* It should be noted that, although simple departure
from linearity in the individual scale transformations
may be compensatory between teammates, the rank-
order inversions are very unlikely to be compensated.




TABLE 5


PRODUCT-MOMENT (r) AND RANK-ORDER (ρ)
CORRELATIONS BETWEEN PRESENTED AND
REPRODUCED BRIGHTNESS VALUES


                    Feedback conditions
Trial          T               R               TR
block       r      ρ        r      ρ        r      ρ
  1       .748   .778     .799   .801     .760   .761
  2       .858   .833     .855   .881     .870   .872
  3       .855   .855     .883   .900     .854   .859

The two indexes here employed to test 
transmitted information in this more liberal 
sense were the product-moment and rank-order 
correlations between the presented stimuli and 
reproduced brightness. The results, in Table 5, 
are subclass means for the three feedback 
conditions on each of the three trial blocks. 
As these data are intended for supplementary 
descriptive purposes only, the correlations were 
averaged without transformation and no sta- 
tistical tests have been used. 

It is clear by inspection that the feedback
conditions are ordered in the same way for
these criteria as for the squared error measures.
In this case, the TR condition appears to be 
closer to the best condition, R, than to the 
poorest condition, T. All three conditions show 
marked improvement between the first trial 
block and the two final trial blocks. Finally, 
the two types of correlation give very similar 
results: this would be expected from earlier 
indications that individual scale transforma- 
tions in each direction are essentially linear. 
Further analysis of these correlations, and 
their relation with individual scaling correla- 
tions mentioned earlier, did not uncover any 
results beyond those already found in the 
squared error measures.

Finally, an attempt was made to trace 
changes in the individual regression equations 
over trial blocks. The intention of this analysis 
was to obtain a detailed picture of the adapta- 
tion of team members to various forms of 
feedback. It became apparent, however, that 
this process occurs too rapidly to be reflected 
in an analysis of this type: indicated modifications
in experimental procedure are suggested
below. 


Discussion 


The principal result of this study concerns 
the ordering of the experimental feedback 
conditions which, as mentioned earlier, con- 
firms previous studies. The finding is that 
feedback to the receiver, R, produces best 
performance; then feedback to both the trans- 
mitter, T, and receiver, R; and poorest per- 
formance occurs in the condition in which only 
T receives feedback. An examination of the 
component parts of the adjustment process 
supplies an intuitive basis for this ordering. 

Table 6 contains a comparison of hypo- 
thetical adjustments under each feedback 
condition in the simple case in which the 
stimulus presentation is repeated for two trials. 
On the first trial it is supposed that the trans- 
mitted or reproduced brightness, 20 milli- 
amperes, is considerably lower than the 
presented brightness. 

For the R feedback condition, all that is 
necessary is for R to adjust his setting on the 
second trial to match the feedback brightness
on the first trial for the same number. In the 
example it is assumed that both T and R 
exhibit some nonvoluntary variation in dupli- 
cating these values. The consequent setting 
is only 2 milliamperes from the standard 
brightness. 

In the T feedback condition it is necessary 
for T to translate the discrepancy between 
presented and reproduced brightness into nu- 
merical terms and then to estimate the increase 
required in order to get R to match the pre- 
sented value. It is assumed in the example that 
R’s regression is flatter than T’s, so the ad- 
justed value is still quite low. 


TABLE 6


HYPOTHETICAL ADJUSTMENTS IN THE VARIOUS
FEEDBACK CONDITIONS


                 Presented      Transmitted     Reproduced
                 brightnessᵃ      number        brightnessᵃ
Original             30             60              20
R feedback           30             58              32
T feedback           30             80              25
TR feedback          30             80              40

ᵃ In milliamperes.
For the TR feedback condition, both persons
may make adjustments of the kind described. 
This results in an overcompensation for the 
original discrepancy. On the trial following 
that, there may be an overcompensation in the 
opposite direction, one of the partners may 
elect to "sit tight,” or both partners may “sit 
tight." Only in the second case will there be an 
improvement, and an oscillatory process is 
more likely to occur. 

Unfortunately, although the mechanics sug- 
gested here are supported by discussion with 
subjects and by an impressionistic study of the 
data, it does not seem possible to obtain more 
definitive evidence at this stage of investiga- 
tion. Aside from the fact that learning is very 
rapid, there is an additional complication 
introduced by the variation of presented 
brightness from trial to trial. 

A second clear result of this study is the 
comparatively greater contribution of non- 
linear scaling error than linear differences to 
the gross discrepancy scores. The former com- 
ponent reflects both deliberate changes of 
scale transformation to adjust to feedback and 
nondeliberate unreliability of the transformations
both from presented brightness to
number and from number back to reproduced 
brightnesses. It should be observed that the 
particular psychophysical task used in this 
study is almost certainly a determinant of the 
relative magnitude of those component error 
terms. However, without some degree of in- 
herent unreliability, the matching process is 
almost trivial. 

Granting that the substantive results here 
noted require corroboration on a broader range 
of tasks, it appears that much of the method- 
ology of this study can be extended without 
essential change to new situations. The basic 
experimental setup, involving the encoding and 
decoding of some environmental referent by 
separate persons with controlled feedback, can
certainly be adapted to a variety of subject
matters. The analytical techniques here used,
and particularly the procedure for partitioning 
error, can be used for scale-matching tasks in 
which the scales are linear or can be converted 
to linear form by a coordinate transformation. 

The chief methodological weakness of the 
present study is the lack of evidence on the 
exact mechanics of the learning process. The 
transition from confusion to surprisingly good 
performance occurs very abruptly, and further 
practice results in little further improvement. 
The most plausible diagnosis of this difficulty 
was that the feedback subjects received—side 
by side display of presented and reproduced 
brightnesses—made the learning task too easy. 
Ongoing research attempts to moderate the
learning rate by giving subjects only partial 
feedback. 

Finally, it should be made clear that this 
laboratory situation is not represented as a 
general paradigm for language learning. It 
appears to offer a very good analog for 
certain aspects of both individual and social 
language modifications—those entailing grada- 
tions or units within a fixed dimensional scheme 
—but it does not, in its present form, capture 
the more important processes by which the 
dimensional schemata are developed. Neither 
does it offer any obvious approach to the vast 
problem of how “assigns” or secondary defini- 
tions are established. It may, however, be 
hoped that, as the methodology for studying 
simple language situations and the theoretical 
analysis of more general language processes
both develop, some point of rapprochement 
will ultimately be found. 
AGGRESSION THEMES IN A BINOCULAR
RIVALRY SITUATION ¹


MARV MOORE ²


Michigan State University 


This study explored the differential effects of sex role and age on the per- 
ception of violence. Since girls learn to be less aggressive than boys, they 
should perceive less violence than boys; there should be a relation between 
age and perception of violence. In the binocular rivalry situation, a "violent" 
picture was tachistoscopically presented to 1 eye simultaneously with a “non- 
violent" picture shown to the other. Each S saw 6 such pairs of slides shown 
twice in random order. A “violence” score was computed for each person 
based on the number of violent pictures seen. Ss were 15 males and 15 females 
from each of the 3rd, 5th, 7th, 9th, and 11th grades, and college freshmen. 
Results confirmed both hypotheses; males perceived significantly more violence, 
and the increase of violence perception was linearly related to age for both
sexes.


Several recent experimenters have studied 
the effects of presenting differentially mean- 
ingful figures simultaneously to both eyes by 
means of a stereoscope. Engel (1956) showed 
that a more familiar figure (an upright face) 
will predominate in binocular rivalry over a 
less familiar figure (an inverted face). Using 
postage stamps with busts of persons such 
as John Adams as stimuli, Hastorf and Myro 
(1959) confirmed Engel’s results. Bagby 
(1959) paired photographs of “Mexican” and “American”
scenes and presented them to Mexican and
American subjects. Cultural familiarity 
tended to determine which picture any given 
subject saw; whereas Mexicans tended to see 
the Mexican scenes, Americans more often 
reported the American scenes. 

Toch and Schulte (1961) paired violent 
and neutral scenes in the stereoscope; sub- 
jects with 3 years of “Police Administration” 
training saw significantly more violent picto- 
grams than a matched group of liberal arts 
students and a group entering the police 
training program. Shelley and Toch (1962) 
studied a group of institutionalized offenders 


1 This research was done as a master’s thesis and 
supported by Grant F1-MH-20,820 from the Na- 
tional Institute of Mental Health, United States 
Public Health Service. 

2 I would like to thank Charles Hanley for so
helpfully serving as my thesis supervisor and Hans 
Toch for the use of his stereoscope while he was 
on sabbatical leave. I especially wish to thank them 
for their aid in editing and preparing this manu- 
script for publication. 


using the same slides; they concluded that 
the tendency to perceive violence in the stereoscope
was diagnostic of a tendency to behave
in a troublesome manner. Berg and Toch 
(1964) showed prison inmates pictures in- 
cluding drives other than aggression (oral 
and sexual); each pair of stereoscope slides 
contained a “blatant” and a “socialized” form 
of drive expression. They confirmed Shelley 
and Toch’s use of the stereoscope as a diag- 
nostic indicator of impulsive behavior and 
were also able to discriminate between inmates
classified by means of other psychodiagnostic
measures as either “impulsive” or
“neurotic.”

The studies described above tested implications
of the general hypothesis that specific
past experiences acquired under particular 
conditions or training sensitize a person to 
related content in the binocular rivalry situa- 
tion. Elevated perception of violence, in par- 
ticular, results from training which exposes 
one to violent material and in certain 
populations is positively correlated with the 
tendency towards aggressive acting out. 

The present study investigates the effects
of differential socialization of the sexes on 
the perception of violence, and the relation 
of age to the perception of violence. 

Clearly, males and females in Western 
cultures learn specific sex roles. One aspect 
of this socialization process where there is 
a noticeable difference between the sexes is 
the expression of aggressive behavior. Males 
Fig. 1. Stereogram pair Number 1 used in
this experiment.


learn to be more active, overtly aggressive, 
and socially assertive than females. Various 
investigations have both documented these 
sex differences (e.g., MacFarlane, Allen, &
Honsik, 1954) and demonstrated how these
differences are acquired (e.g., Bandura &
Walters, 1964).

Furthermore, the socialization of sex roles 
is a gradual and continual process from child- 
hood through adulthood; the American young- 
ster does not master the social expression of 
his drives in a day. It follows that sex-role 
training in the expression and control of 
aggression should differentially sensitize males 
and females to related content in the bin- 
ocular rivalry situation; and the amount 
of sensitization should vary in some way 
with age. 

Hypothesis 1. When presented with a 
paired series of violent/nonviolent stereo- 
grams in the binocular rivalry situation, males 
see more of the violent slides than females 
of the same age. 

Hypothesis 2. When presented with a 
paired series of violent/nonviolent stereo- 
grams in the binocular rivalry situation, dif- 
ferent age groups perceive different amounts 
of violence, regardless of sex.


METHOD
Subjects 


The subjects came from two sources. Some were 
drawn from three schools in the Waverly School 
District, a middle-class suburb near Lansing, Michi- 
gan: 15 boys and 15 girls were drawn from each 
of five grades (3, 5, 7, 9, and 11—ages 8, 10, 12, 14, 
and 16, respectively). Fifteen male and 15 female
18-year-old freshmen were obtained from introductory
psychology classes at Michigan State University.


Apparatus 


The apparatus was a modified stereoscope designed
by Engel (1956). In the present experiment, light
intensity was .2 candles/ft² for both fields. An
interval timer attached to the stereoscope set exposure
time of stimulus figures at .5 second throughout the
study.


Stimulus Figures 


Toch and Schulte’s (1961) pairs of slides were 
used. Each “violent” slide was matched with a 
“nonviolent” slide of similar size and outline, cover- 
ing roughly the same part of the visual field. See 
Figure 1 for an example. Content of the stereogram 
pairs was as follows: 


1. Mailman – knifed man.
2. Man with suitcase – hanged man.
3. Farmer pushing plow – man with gun standing over dead person.
4. Man standing at microphone – man shooting self in head.
5. Man at drill press – man knifing another man.
6. Man showing another man a picture – man shooting another man.


Experimental Procedure 


Subjects of both sexes in all six groups saw the 
six pairs of violent/nonviolent slides twice. On the 
second viewing the slide presented first on the right 
side was changed to the left, and vice versa. Thus 
each eye was exposed to all the possible figures. 
Order of presentation was randomized for each 
subject to control for any series effects. 

To check whether the subjects reported what they
actually saw, a pair of “lie slides” (two identical
pictures of blatant violence) was presented after
the 12 violent/nonviolent exposures. Subjects who
do not honestly report their perceptions should give
a nonviolent description of the thirteenth pair. A
few subjects in all age groups failed to give an
accurate description of these lie slides; their scores
were not used in the analysis, and enough other
subjects were sampled in the needed groups to
make the total usable (N = 180).

Subjects who needed glasses wore them. The
experimenter dismissed a few subjects who had forgotten
their glasses, as well as subjects who planned
to obtain glasses because of a known visual problem.

After adjusting the stereoscope for optimal fusion, 
the experimenter gave subjects the following in- 
structions: 


When I put the top down look into the eye- 
piece with both eyes open. You will see a picture 
flash on for a very short time. After you see the 
picture look away from the eyepiece, then tell me 
all you can about the picture; describe whatever 
you see. There are no wrong answers. 


If at any point an inattentive subject stated that a 
pair of slides “just didn't make sense,” he was told 
to “look carefully” and allowed a second trial; this 
seldom happened. 
Fig. 2. The perception of violence as a function
of sex and age.


Verbatim responses were scored according to the 
standards below; Slide Pair 1 (see Figure 1) is used 
for the example in every scoring category. 


Points  Description

2       Clearly the violent stereogram is described
        by the subject, for example, “A man with a
        knife in his back.”

1       Fusion is described with a sensible percept
        including violent content (a compromise response),
        for example, “A mailman with a
        knife in his back.”

1       Clearly the violent stereogram is described,
        but not in violent terms (a compromise response),
        for example, “A man with arms out
        in front and a stick out in the back.”

0       Clearly the nonviolent stereogram is described
        by the subject, for example, “A mailman
        with pouch and letter in his hand.”

0       Fusion is described with a sensible or incomprehensible
        percept, but not including
        violent content, for example, “A man running
        with his arms going in all directions.”


Thus a subject could obtain a score from 0 (violence
never reported) to 24 (violent slide reported on all
12 trials). Actual scores ranged from 0 to 11. Two
persons independently scored all responses according
to the above standards. The Pearson r between the
two sets of total violence scores of subjects was
.98.
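
The scoring rule and the inter-scorer check can be illustrated with a short Python sketch. Everything named here is hypothetical: the category labels are shorthand for the five scoring standards, and the example responses and scorer totals are invented for illustration and are not data from this study.

```python
from math import sqrt

# Points for each scoring standard; the keys are shorthand labels
# invented for this sketch.
POINTS = {
    "violent": 2,
    "fusion_with_violence": 1,
    "violent_in_neutral_terms": 1,
    "nonviolent": 0,
    "fusion_without_violence": 0,
}


def violence_score(categories):
    """Total violence score for one subject's 12 scored trials (0 to 24)."""
    if len(categories) != 12:
        raise ValueError("expected 12 scored trials")
    return sum(POINTS[c] for c in categories)


def pearson_r(x, y):
    """Product-moment correlation between two sets of total scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented example: 8 nonviolent, 2 violent, and 2 compromise responses.
example = ["nonviolent"] * 8 + ["violent"] * 2 + ["fusion_with_violence"] * 2
print(violence_score(example))

# Invented totals assigned to six subjects by two independent scorers.
scorer_a = [0, 3, 6, 2, 11, 5]
scorer_b = [0, 3, 5, 2, 11, 6]
print(round(pearson_r(scorer_a, scorer_b), 3))
```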


RESULTS 


The raw violence scores were transformed
logarithmically (Base 10) to eliminate the
correlation between grade means and variances.
Figure 2 shows the mean transformed
violence scores³ for each grade tested. First,
there was a difference between sexes at every
grade level, and across grades the sex difference
was significant at the .01 level (F =
15.97, df = 1/168). Second, the amount of
violence perceived increased in a linear fashion
for both sexes as age increased; both
linear trends were significant at the .01 level
(males, F = 14.78, df = 1/84; females, F =
14.09, df = 1/84).


3 Use of raw scores gives the same results.


Discussion 


Both experimental hypotheses were con- 
firmed. In the binocular rivalry situation 
males perceived significantly more violence 
than females over the grades sampled (3, 5, 
7, 9, 11, and freshmen), and the increased 
sensitization to violence with age was demon- 
strated by statistically significant linear trends 
for both sexes. 

One way to explain the sex differences in 
perceptual orientation to the world lies in 
learning theory. As Bandura and Walters
(1964) have illustrated, such personality re- 
sponses as the expression of dependency and 
aggression may be learned. For example, in 
gaining aggression control the child learns to 
discriminate object, form, and intensity ap- 
propriateness for expression of his anger. Such 
discriminations are learned and predictable 
under various schedules of positive and nega- 
tive reinforcement and under conditions fos- 
tering imitation and modeling of significant 
others. Some of the most important discriminations 
that the child learns are the socially 
reinforced differences between the sexes in 
the expression of aggressive behavior. We are 
especially interested in the fact that males are 
taught, directly and indirectly, to be more 
overtly aggressive and assertive than females. 
Such sex-role training, as discussed by Ban- 
dura and Walters, could account for the dif- 
ferences in the perception of violence between 
sexes found at all ages in this study. 

Why under the circumstances of this ex- 
periment does perception of violence increase 
regularly with age for both sexes? One in- 
terpretation is that subjects can, as a func- 
tion of age, discriminate more clearly the 
vague features of the stimulus and/or verbal- 
ize better what they see. An analysis of sub- 
jects’ responses to Slide Pair 13 (lie slides) in 
which identical stereograms were presented 
to each eye simultaneously provides negative 
evidence for the accuracy hypothesis. Table 
1 shows the number of subjects accurately 
describing the lie slides for each grade tested. 
A cursory glance will convince the reader that 
there were no systematic or significant differences 
in accuracy between age (grade) 
levels. 


TABLE 1 

COMPARISON OF SUBJECTS ON DESCRIPTION ACCURACY 
FOR SLIDE PAIR 13 (LIE SLIDE) 

Number of subjects out of 15 (group N) who 
described accurately Slide Pair 13 

Grade        Males    Females 
3             14        13 
5             13        14 
7             15        12 
9             13        14 
11            15        12 
Freshmen      14        13 
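
The counts reported in Table 1 can be tallied to check the statement about accuracy across grades. The sketch below copies the cell counts from the table and assumes that the grade printed as "S" in the scan is Grade 5; the function and variable names are illustrative only.

```python
# Accurate descriptions out of 15 per cell, as (males, females), from Table 1.
# Assumption: the row scanned as "S" is Grade 5.
ACCURATE = {
    "3": (14, 13),
    "5": (13, 14),
    "7": (15, 12),
    "9": (13, 14),
    "11": (15, 12),
    "Freshmen": (14, 13),
}
GROUP_N = 15  # subjects per sex within each grade


def accuracy_by_grade(table, group_n=GROUP_N):
    """Proportion of subjects (both sexes pooled) describing the lie slide accurately."""
    return {grade: (m + f) / (2 * group_n) for grade, (m, f) in table.items()}


rates = accuracy_by_grade(ACCURATE)
```

With these counts every grade pools to 27 of 30 subjects, or 90%, so the table gives no sign of an age trend in descriptive accuracy.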

Assuming the accuracy hypothesis to be 
unverified, how is the linear age effect to be 
explained? Previous research suggests that 
the binocular rivalry technique is a direct 
measure of aggressiveness (Berg & Toch, 
1964; Shelley & Toch, 1962). Assuming this 
knowledge to be true, we may conclude in 
the present study that as persons age they 
display more aggression or perceive the world 
as a more hostile place. As mentioned in re- 
lation to the sex differences in violence per- 
ception, socialization involves learning when 
and how to aggress. For example, children 
gain with age better control of anger, while at 
the same time they are taught to be competi- 
tive, and the older they are the greater the 
repertoire of responses approved as means of 
competition. In other words, aggression ex- 
pression gets socialized with age from overt 
anger to more culturally reinforced avenues, 
particularly means of competition, which are 
instrumental to social adaptation. It seems, 
therefore, plausible that the developmental 
trends in this study are a reflection of culturally 
sanctioned "instrumental aggressiveness" 
increasing with age. 

From the point of view presented in this 
paper socialization is similar to the process 
underlying the fact that advanced police-ad- 
ministration students reported seeing more 
violent stereograms than novices in the same 
training program (Toch & Schulte, 1961)— 
a process of education into the policeman’s 
reality. As children mature they are educated 
into a reality that is a slow motion facsimile 
of the police training situation—a reality 
where they become increasingly familiar with 
the abundance of aggression loose in the 
world. Finally, then, the sex and developmental 
differences found in this study represent 
learned variant sensitivities of males and 
females to aggressive situations and feelings 
which they must know about to operate effec- 
tively within a social and cultural context. 
SOCIAL DRINKING, ANXIETY, AND DEPRESSION ¹ 


ALLAN F. WILLIAMS ² 


Harvard University 


5 stag cocktail parties were held in which a total of 91 students from 2 colleges 
participated. These Ss completed a problem-drinking scale—interpreted as 
measuring proneness to alcoholism—and anxiety and depression adjective check 
lists which were given before the party, after 4 oz., and at the end of the 
party. Preparty results indicated that problem drinking was positively associated 
with anxiety (p < .005), depression (p < .05), and amount of alcohol 
consumed (p < .005). Anxiety and depression decreased significantly at low 
levels of alcohol consumption (from 4-6 oz., generally). At 8 oz. and above, 
these changes were reversed, as anxiety and depression increased, rising nearly 
to base-line (preparty) levels. Problem drinkers neither increased nor decreased 
more than nonproblem drinkers on these variables. 


The present study was undertaken with two 
purposes in mind: to investigate psychological 
reasons for normal social drinking; and to attempt 
to determine why some people drink frequently 
and excessively and perhaps eventually 
become alcoholics. 


Although it is recognized that people drink 
in order to “feel differently,” there has been 
little research on emotional effects of alcohol; 
the bulk of current knowledge, or rather lore, 
in this area derives from clinical and introspective 
reports. Clearly, systematic investigation of 
emotional effects of alcohol is needed, with at- 
tention focused on variables which would seem 
to be of importance: personality, amount of alco- 
hol consumed, and the situation in which it is 
consumed. 

Situational influences on the effects of alcohol 
have recently been emphasized by Kalin, Mc- 
Clelland, and Kahn (1965). Most previous re- 
search in this area has attempted to isolate the 
effects of alcohol per se. Kalin et al. argue that 
this goal cannot be attained: that results in 
psychological research are always a product of 
the interaction between the experimental manipu- 
lation and the set and setting. Moreover, these 
investigators point out that the typical experi- 
mental situation in alcohol research (laboratory 
or hospital setting, intravenous administration) 


1 This investigation was supported by grants from 
the Social Relations Laboratory, Harvard University, 
and from the Division of Alcoholism, Massachusetts 
Department of Public Health. The research was 
carried out under the auspices of the Division of 
Alcoholism, through support of NIMH Training 
Grant MH-7460. 

2 Now with the Division of Alcoholism, Massachusetts 
Department of Public Health. 


is likely to be inhibiting and anxiety arousing. 
Thus comparing alcohol and placebo subjects 
does not isolate the effects of alcohol per se: 
it isolates the effect of alcohol in an anxiety- 
arousing atmosphere. This type of atmosphere is 
likely to inhibit any “positive” emotional ef- 
fects which may occur under more normal drink- 
ing conditions. Since Kalin et al. were inter- 
ested in why people drink, in their own research 
they studied the effects of social drinking on 
fantasy, allowing people to drink as they nor- 
mally would, in natural party settings in which 
a choice of drinks was available for ad-lib 
consumption. The present study follows this ori- 
entation, studying the effects of male social 
drinking on anxiety and depression.³ 

An investigation of emotional effects of social 
drinking may help to account for the widespread 
use of alcohol; it may also furnish clues as to 
why some people regularly drink to excess and 
eventually become alcoholics. To explore this 
possibility, high and low scorers on a college 
problem-drinking scale (Park, 1958) were compared 
in a sober condition and after the ingestion 
of alcohol. The problem-drinking scale is 
interpreted as measuring predisposition, or prone- 
ness, to alcoholism, and validation efforts by 
both Park, and Williams (1964) supported this 
interpretation. Since the scale is of recent origin, 
it has not been established that high scorers 
on problem drinking do eventually become alco- 
holics. However, in order to gain leads to the 
etiology of alcoholism, problem drinkers—heavy 
and frequent drinkers by definition—are considered 
as though they were prealcoholics, and 
the effects of alcohol which they experienced 
were compared to those experienced by nonproblem 
drinkers. 


3 These two variables were selected from a larger 
number of personality variables studied (Williams, 
1964). 

There are good reasons for considering the 
possible relationship between college problem 
drinking and alcoholism, rather than working 
directly with alcoholics. In personality research 
with alcoholics, etiological statements are pre- 
cluded since investigators are unable to dismiss 
the possibility that their findings are the result 
of the numerous social and psychological conse- 
quences of 15 or 20 years of excessive drinking 
and loss of control over alcohol (cf. Jellinek, 
1952). Thus, for example, the findings that alco- 
holics are characterized by anxiety (Manson, 
1948) and depression (Zwerling, 1959) are no 
guarantee that as prealcoholics they were unduly 
anxious or depressed, since the social isolation 
and the various personal stresses which accom- 
pany alcoholism are very likely to foster, or 
aggravate, these personality characteristics. And 
if the personality structure of the prealcoholic 
is a matter of conjecture, one is also left in 
doubt as to whether effects of alcohol experienced 
by alcoholics now were prominent prior to their 
becoming alcoholics, and perhaps contributed to 
the development of this disorder. By working 
with problem drinkers in young adulthood, it 
may be possible to isolate those personality char- 
acteristics which precede the development of 
alcoholism from any which follow primarily as 
a consequence of this disorder. And, once per- 
sonality characteristics of problem drinkers are 
known, possible reasons for their heavy and fre- 
quent drinking and perhaps eventual alcoholism 
can be explored. 


Hypotheses 


Based on an informal analysis of “tension” 
and “depression” words from the Gough Ad- 
jective Check List (Gough & Heilbrun, 1965) 
used by problem drinkers in a previous study 
(Williams, 1965), it is hypothesized that problem 
drinking is positively associated with anxiety 
and depression. Problem drinking is also 
hypothesized to be positively associated with 
amount of alcohol consumed. 

Clinical and introspective reports of the effects 
of alcohol have indicated that alcohol reduces 
anxiety and depression. Research which bears 
upon the notion that alcohol reduces anxiety 
has generally yielded favorable results, both in 
animal research, in which it has been demon- 
strated that alcohol lowers the avoidance gradient 
in approach-avoidance conflicts (cf. Conger, 
1956); and in research with humans, using skin 
conductance and GSR as measures of tension 
(cf. Greenberg & Carpenter, 1957; Lienert & 
Traxel, 1959). It is hypothesized that anxiety 
and depression decrease under alcohol. It is 
further hypothesized that the absolute decreases 
in anxiety and depression are greater for problem 
drinkers than for nonproblem drinkers. 


METHOD AND PROCEDURE 


The study was run as a series of stag cocktail 
parties held for fraternity members in their fra- 
ternity houses at two men’s colleges in New York 
State. Five parties were held in all; three at one 
college and two at the other. The usual starting time 
for the parties was around 5 P.M.,⁴ and they varied 
in length from 60 to 70 minutes. A total of 91 subjects 
participated in the study; Ns of the five fraternity 
parties were 14, 18, 20, 19, and 20. 

The research was introduced as a study of the 
effect of various social atmospheres (party settings 
in this case) and of beverage preference on responses 
to psychological questionnaires. This introduction 
was given in order to keep subjects from realizing 
that alcohol was the major focus of the study. 

The parties were structured as little as possible 
by the experimenter. When all of the subjects were 
present, they were told that they could do whatever 
they liked and were urged to enjoy themselves; that 
there were gin, scotch, bourbon, and appropriate 
mixers available at the bar; and that they could get 
a drink by signing a 3 × 5 card, writing down the 
type of drink they wanted, and handing the card to 
one of the two bartenders hired by the experimenter. 
Each drink contained 2 ounces of liquor. 

Measures were taken three times: in the evening 
of the day preceding the party (Condition A), during 
the party after two drinks (4 ounces) (Condition B), 
and at the end of the party after ad-lib 
drinking (Condition C). Since people drink at dif- 
ferent rates, subjects were not all tested at the same 
time at Condition B. However, in only two cases 
was there more than a 15-minute difference between 
the time the first and last subjects finished two 
drinks and completed the questionnaire. 

The problem-drinking scale, developed by factor 
analysis (Park, 1958), contains 13 items, several of 
which correspond to Jellinek's (1952) phasic char- 
acterization of symptoms and consequences of alco- 
holism. Twelve of these items have positive loadings; 
one has a negative loading.5 In scoring this variable, 


4 In order to insure the safety of the students 
participating in the study, the parties were ar- 
ranged so that they would eat dinner in the house 
right after the party. In addition, since many of 
the students lived in fraternities, they could (and 
were requested to) remain there during the evening. 

5 The 12 positively loaded items are: “has felt 
that subject might become dependent on or addicted 
to the use of alcoholic beverages"; “has incurred 
social complications due to drinking"; “has feared 
the long-range consequences of own drinking"; 
"drinks large or medium amount of alcoholic bev- 
scores of +1 or −1 were assigned to positive responses 
to the items. The theoretical range of scores 
on the problem-drinking scale was therefore −1 to 
+12. 
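
The scoring rule for the problem-drinking scale can be sketched as follows. The sketch assumes that each of the 12 positively loaded items contributes +1 when endorsed and that the single negatively loaded item contributes −1 when endorsed, which reproduces the stated theoretical range; the function and argument names are hypothetical.

```python
POSITIVE_ITEMS = 12  # items with positive loadings
NEGATIVE_ITEMS = 1   # the item "drinks to comply with custom"


def problem_drinking_score(positive_endorsed, negative_endorsed):
    """Score = number of positive items endorsed, minus 1 if the negative item is endorsed.

    positive_endorsed: count of positively loaded items answered positively (0-12).
    negative_endorsed: True if the negatively loaded item was answered positively.
    """
    if not 0 <= positive_endorsed <= POSITIVE_ITEMS:
        raise ValueError("positive_endorsed must be between 0 and 12")
    return positive_endorsed - (1 if negative_endorsed else 0)


lowest = problem_drinking_score(0, True)     # only the negative item endorsed
highest = problem_drinking_score(12, False)  # all positive items, not the negative one
```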

Anxiety and depression were measured by means 
of adjective check lists developed by Zuckerman 
(1960) and Zuckerman, Lubin, Vogel, and Valerius 
(1964), respectively. The anxiety check list contains 
11 anxiety-plus words (e.g., afraid, desperate, tense) 
scored +1 if checked, and 10 anxiety-minus adjec- 
tives (e.g., calm, cheerful) scored +1 if not checked. 
The depression check list, scored in the same man- 
ner, contains 20 depression-plus adjectives (e.g., 
alone, blue, discouraged) and 20 depression-minus 
adjectives (e.g., active, clean, enthusiastic). 

The directions for descriptions on the anxiety and 
depression check lists in the preparty condition dif- 
fered in time set from the directions for the two 
alcohol conditions: In the preparty condition, a gen- 
eral time set was called for (how you feel generally) ; 
in the alcohol conditions, subjects were asked to 
describe themselves as they felt "right now—at the 
moment." 
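
The check-list scoring described above (one point for each "plus" word checked and one point for each "minus" word left unchecked) can be sketched in a few lines. The word lists below contain only the example adjectives named in the text, not the full instruments, and the function name is hypothetical.

```python
def checklist_score(checked, plus_words, minus_words):
    """Score an adjective check list.

    +1 for every plus word the subject checked,
    +1 for every minus word the subject did NOT check.
    """
    checked = set(checked)
    plus_points = sum(1 for word in plus_words if word in checked)
    minus_points = sum(1 for word in minus_words if word not in checked)
    return plus_points + minus_points


# Example adjectives named in the article (partial lists, for illustration only).
ANXIETY_PLUS = ["afraid", "desperate", "tense"]
ANXIETY_MINUS = ["calm", "cheerful"]

# A hypothetical subject who checks "afraid" and "calm".
example = checklist_score(["afraid", "calm"], ANXIETY_PLUS, ANXIETY_MINUS)
```

Under this rule the full anxiety list (11 plus and 10 minus words) yields scores from 0 to 21, and the depression list (20 plus and 20 minus words) yields scores from 0 to 40.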


RESULTS 


The range of problem-drinking scores in the 
present study was —1 to +12, and the overall 
problem-drinking mean (N = 91) was 5.3. The 
mean number of ounces of 86-proof alcohol 
consumed by the 91 subjects was 11.3, and the 
range was 4-28. Since most subjects were drink- 
ing their last drink while filling out the Condi- 
tion C questionnaires, the total ounce figures 
as a record of amount consumed are slightly 
elevated in many cases. 

For the preparty data, scores on the problem- 
drinking scale were correlated with the de- 
pendent variables for each of the five fraterni- 
ties, and the five correlations were averaged. 
The results indicated support for each of the 
hypotheses. Problem drinking was positively as- 
sociated with anxiety (r = +.37, p < .005, one- 
tailed), depression (r= +.26, p< .05, one- 
tailed), and amount of alcohol consumed (r= 
+.30, p < .005, one-tailed). 
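
The within-group correlation procedure can be sketched as below. The article says only that the five fraternity correlations were averaged; the sketch takes a simple arithmetic mean of the five values, which is an assumption (an average of Fisher z values would be another possibility). All data and names are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length (at least 2 values)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def averaged_within_group_r(groups):
    """Correlate x with y separately within each group, then average the r values.

    groups: list of (x_values, y_values) pairs, one pair per group.
    """
    return mean([pearson_r(x, y) for x, y in groups])


# Two hypothetical groups: problem-drinking scores paired with anxiety scores.
groups = [
    ([1, 4, 6, 9], [3, 5, 4, 8]),
    ([2, 5, 7, 10], [6, 4, 7, 9]),
]
average_r = averaged_within_group_r(groups)
```

Correlating within each fraternity before averaging keeps differences between houses (for example, in overall drinking level) from inflating or masking the relationship.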

Analysis of the alcohol data by means of t 


erages at a sitting and more than once a week"; 
“likes to be one or two drinks ahead without others 
knowing it"; “has gone on the water wagon as the 
result of self-decision or advice of the family or 
friends"; “has had one or more drinks before or 
instead of breakfast"; “has become drunk when 
alone"; “has had one or more drinks alone"; “has 
gone on week-end drinking sprees"; "has been led 
by drinking to aggressive, wantonly destructive, or 
malicious behavior”; “has experienced blackouts in 
connection with drinking.” The last 6 items are 
scored if they have occurred one or more times. The 
negatively loaded item is: “drinks to comply with 
custom.” 
TABLE 1 

MEANS OF ANXIETY AND DEPRESSION SCORES AND t 
TESTS FOR DIFFERENCES BETWEEN MEANS 
FOR CONDITIONS A, B, AND C 

                            M                      t 
Variable      Nᵃ        A      B      C      A-B      B-C     A-C 
Anxiety     91 (87)    5.6    4.1    5.1   3.75***  2.76**   1.08 
Depression  91 (87)   11.3    9.8   11.1   3.38**   2.27*     .02 

a Four subjects did not complete Condition C questionnaires. 
The C mean, and B − C and A − C t tests are therefore 
based on an N of 87. 
* p < .05, two-tailed. 
** p < .01, two-tailed. 
*** p < .001, two-tailed. 


tests (Table 1) indicated that from preparty to 
4 ounces (A-B), anxiety decreased (p< .001, 
two-tailed) and depression decreased (p< .01, 
two-tailed). From 4 ounces to end of party 
(B-C) both of these effects were reversed: 
anxiety increased (p< .01, two-tailed) and de- 
pression increased (p< .05, two-tailed). 

Plots of B-C shifts by amount of alcohol con- 
sumed revealed that the decrease in anxiety 
which was noted at the 4-ounce level continued 
to occur from B-C at 6 ounces. At 8 ounces 
and above, the B-C increase occurred, and this 
increase was approximately the same at each 
level of consumption above 6 ounces. Depression 
continued to decrease slightly at 6 and 10 
ounces. The B-C increase in depression was ap- 
parent at 8 ounces and was especially marked 
at 12 ounces and above. Since the distribution 
of B-C shifts by alcohol consumption was non- 
linear, subjects were divided into those who had 
consumed 6-10 ounces (N = 50) and those who 
had consumed 12 or more ounces (N = 37). A 
Mann-Whitney U test based on change scores 
yielded a Z of 2.11 (p < .05, two-tailed), revealing 
that the B-C increase in depression was 
greater at levels of consumption above 10 ounces 
than at 6-10 ounces. 
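
For readers unfamiliar with how a Mann-Whitney U test yields a Z value, the sketch below uses the large-sample normal approximation. It is a simplified version: tied values receive midranks, but no tie correction or continuity correction is applied, and the article does not state which variant was used. The function name and the data in the test of the sketch are hypothetical.

```python
from math import sqrt


def mann_whitney_z(sample_a, sample_b):
    """Z statistic for the Mann-Whitney U test (normal approximation).

    Simplifications (assumptions of this sketch): midranks for ties,
    no tie correction to the variance, no continuity correction.
    """
    values = sorted(list(sample_a) + list(sample_b))

    # Assign each distinct value the average of the ranks it occupies.
    midrank = {}
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        midrank[values[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    n1, n2 = len(sample_a), len(sample_b)
    rank_sum_a = sum(midrank[v] for v in sample_a)
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u_a - mean_u) / sd_u
```

In the study the two samples would be the B-C change scores of the 50 subjects who drank 6-10 ounces and the 37 who drank 12 ounces or more; with groups of that size the normal approximation is the usual choice.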

Correlations computed between problem drink- 
ing and A-B, B-C, and A-C shift scores indi- 
cated that problem drinkers did not increase 
or decrease more than nonproblem drinkers on 
either anxiety or depression. 


DISCUSSION 


Each of the personality variables found to 
be characteristic of problem drinkers has been 
found to characterize alcoholics, as noted earlier. 
Since alcoholism appears capable of producing 
these characteristics, the findings that problem 
drinking is related to anxiety and depression 
may be important ones, indicating that these 
aspects of personality precede the development 
of alcoholism. It must be emphasized, however, 
that this conclusion hinges upon the validity of 
the problem-drinking scale in identifying pre- 
alcoholics. 

The decreases in anxiety and depression at 
low levels of alcohol consumption (from 4 to 
6 ounces, generally) are consistent with clinical 
reports and some research on the effects of 
alcohol. The increases in anxiety and depression 
at higher levels of consumption indicated that 
there is a dosage effect of alcohol. At low levels 
of consumption it is generally agreed that the 
major anesthetic effect of alcohol on the brain 
is to remove normally prevailing inhibitions and 
restraints, so that emotional stimulation (feel- 
ing "high") is the most apparent feature—at 
least when alcohol is consumed in relaxed and 
natural settings. At this stage one would expect 
emotional changes like the decreases in anxiety 
and depression noted. With increasing dosage 
levels, however, the anesthetic effect builds, 
resulting in a progressive impairment of func- 
tions. The subjects who increased in anxiety 
and depression from B-C had consumed sufficient 
amounts of alcohol for the anesthetic effects to 
be marked. These would include tiredness, dull- 
ness, nausea, a decrease in perception, inability 
to control one’s actions, and inability to comprehend 
what is going on around one. Such 
factors are likely to create anxiety and de- 
pression in the drinker, and can probably account 
for the B-C increases noted.⁶ 

Concerning the question of why people drink, 
the decreases in anxiety and depression which 
occurred at low levels of alcohol consumption 
are “positive” effects and can help to account 
for the widespread occurrence of moderate social 
drinking. On the other hand, no answer is avail- 
able to the question of why some people drink 
to excess, as the effects of social drinking on 
anxiety and depression were reversed at higher 
levels of consumption, changing to “negative” 
effects. Since the average amount of alcohol 
consumed in this study was 11.3 ounces, some- 
what above the point at which the “positive” 
effects disappeared, the question of why people 
drink to excess is especially pertinent. 

Although the data of this study cannot account 


6 Since the anesthetic effect of alcohol at high 
consumption levels is likely to affect the drinker’s 
ability to respond to psychological questionnaires, 
thereby altering the reliability and validity of such 
instruments, the interpretation given for the B-C 
increases in anxiety and depression involves the 
assumption that the reliability and validity of the 
measures of these variables did not change appre- 
ciably throughout the entire experimental session. 
for excessive drinking, it remains possible to 
consider what psychodynamic functions alcohol 
serves for problem drinkers which it does not 
also provide for nonproblem drinkers. Problem 
drinkers were shown to be higher than non- 
problem drinkers on preparty measures of anx- 
iety and depression. When anxiety and depres- 
sion decreased under alcohol, problem drinkers 
did not decrease more on these variables than 
did nonproblem drinkers. It appeared, rather, 
that under alcohol, problem drinkers became 
more nearly “normal” on anxiety and depression; 
that is, closer to what nonproblem drinkers 
were like in a sober condition.⁷ It is reasonable 
to assume that persons with relatively high 
amounts of anxiety and depression appreciate 
the relief afforded by the decreases in anxiety 
and depression under alcohol more so than per- 
sons characterized by normal amounts of these 
variables, and that they may tend to drink 
frequently in order to obtain this temporary 
relief. In short, it would seem that the motivation 
for drinking would be stronger and more 
insistent for problem drinkers, a condition which 
may have contributed to their becoming prob- 
lem drinkers and may eventually lead to their 
becoming alcoholics. 


7 Based on the top and bottom 40% of the distri- 
bution on problem drinking, the Condition A and 
Condition B means for problem drinkers (scores 
from +7 to +12, N=36) and nonproblem drinkers 
(scores from —1 to +4, N=36) on anxiety and 
depression were as follows: for anxiety—problem 
drinkers, A=6.5, B=5.2; nonproblem drinkers, 
A=4.9, B=4.0. For depression—problem drinkers, 
A=13.1, B=11.8; nonproblem drinkers, A = 10.9, 
9. 


REFERENCES 


Conger, J. J. Reinforcement theory and the dynamics 
of alcoholism. Quarterly Journal of Studies on 
Alcohol, 1956, 17, 296-305. 

Gough, H., & Heilbrun, A. B. The Adjective Check 
List manual. Palo Alto, Calif.: Consulting 
Psychologists Press, 1965. 

Greenberg, L., & Carpenter, J. A. The effect of 
alcoholic beverages on skin conductance and emotional 
tension: I. Wine, whiskey and alcohol. 
Quarterly Journal of Studies on Alcohol, 1957, 18, 
190-204. 

Jellinek, E. M. Phases of alcohol addiction. 
Quarterly Journal of Studies on Alcohol, 1952, 13, 
673-684. 

Kalin, R., McClelland, D. C., & Kahn, M. The 
effects of male social drinking on fantasy. Journal 
of Personality and Social Psychology, 1965, 1, 
441-452. 

Lienert, G. A., & Traxel, W. The effects of meprobamate 
and alcohol on galvanic skin response. 
Journal of Psychology, 1959, 48, 329-344. 


BRIEF ARTICLES 


Manson, M. P. A psychometric differentiation of 
alcoholics from nonalcoholics. Quarterly Journal 
of Studies on Alcohol, 1948, 9, 175-206. 

Park, P. Problem drinking and social orientation. 
Unpublished doctoral dissertation, Yale University, 
1958. 

Williams, A. F. Psychological effects of alcohol in 
natural party settings. Unpublished doctoral 
dissertation, Harvard University, 1964. 

Williams, A. F. Self concepts of college problem 
drinkers: I. A comparison with alcoholics. Quarterly 
Journal of Studies on Alcohol, 1965, 26, 586-594. 


Journal of Personality and Social Psychology 
1966, Vol. 3, No. 6, 693-696 




Zuckerman, M. The development of an affect adjective 
check list for the measurement of anxiety. 
Journal of Consulting Psychology, 1960, 24, 457-462. 

Zuckerman, M., Lubin, B., Vogel, L., & Valerius, 
E. The measurement of experimentally induced 
affects. Journal of Consulting Psychology, 1964, 
28, 418-425. 

Zwerling, I. Psychiatric findings in an interdisciplinary 
study of forty-six alcoholic patients. 
Quarterly Journal of Studies on Alcohol, 1959, 20, 
543-554. 

(Received March 15, 1965) 


ACHIEVEMENT MOTIVATION AND TASK RECALL IN 
COMPETITIVE SITUATIONS 


BERNARD WEINER ² 


Center for Personality Research, University of Minnesota 


The recall of completed and incompleted tasks was used to investigate the 
motivational effects of social context on achievement striving. 3 experimental 
conditions were employed: a male competing against a male, a female competing 
against a female, and a male competing against a female. Results showed males 
exhibited a significantly greater Zeigarnik effect after competing against females 
than after competing against males. Female Ss also showed a greater Zeigarnik 
effect when competing against females than when competing against males, 
although the difference in recall between the females in the 2 competitive 
conditions was not statistically significant. An objective test used to measure 
achievement motivation predicted differential recall of incompleted and completed 
tasks for male Ss but not for females. Because the test items were 
derived from Atkinson’s (1957) theory of achievement motivation, the results 
tend to validate the test and the theory. 


It has been asserted (Mead, 1949) that the 
norms of our society prohibit females from com- 
peting in achievement-oriented activities. This 
general hypothesis has received empirical sup- 
port from investigations which demonstrated that 
females are less effective at problem solving 
than males (Hoffman & Maier, 1961), and that 
females classified as relatively masculine on the 
basis of score on a masculinity-femininity scale 
perform better at problem-solving tasks than 
females classified as relatively feminine (Milton, 
1957). Mead’s assertion also has been supported 
in a study by Lesser, Krawitz, and Packard 


1 This study was in part conducted while the 
author was at the University of Michigan. The 
author wishes to thank Sherrie Lindborg, Patricia 
O'Connor, and Phillip Newman for their valuable 
assistance. 

2 Now at the University of California, Los Angeles. 


(1963) which revealed that underachieving girls 
may not perceive intellectual achievement as ap- 
propriate to their social roles. 

The studies cited above have stressed that 
males and females differ in their tendencies to 
inhibit achievement strivings. This study investi- 
gates the relationship between the inhibition of 
achievement motivation and the social context 
in which the achievement-related behavior oc- 
curs. To determine the effects of social context 
on achievement strivings, the performance of 
subjects in three two-person competitive situa- 
tions will be compared. The situations are a 
female competing against a female, a female 
competing against a male, and a male compet- 
ing against a male. 

In the studies of Hoffman and Maier (1961) 
and Milton (1957) the response indicator of 
achievement motivation was performance on 
problem-solving tasks. However, this may not 
be the best dependent variable when comparing 
the achievement motivation of males and females. 
Males may have had more commerce and 
more success experiences with such tasks than 
females. Another possible indicator of aroused 
achievement motivation is the differential recall 
of incompleted and completed tasks. Atkinson 
(1953) and Atkinson and Raphelson (1956) 
have demonstrated that when achievement moti- 
vation is high, subjects recall more incompleted 
than completed tasks (the Zeigarnik effect) in 
achievement-oriented conditions. Atkinson (1953) 
also cited data indicating that volunteer subjects 
are relatively high in need for achievement (n 
Achievement), and Green (1963) demonstrated 
that volunteer subjects exhibit a greater Zeigar- 
nik effect (Zeigarnik, 1927) than nonvolunteer 
subjects. Thus there is strong evidence that the 
differential recall of incompleted tasks is a valid 
behavioral criterion of aroused achievement mo- 
tivation. In this study the social facilitation and 
social inhibition of achievement strivings will be 
investigated by comparing the recall of incom- 
pleted and completed tasks between subjects in 
the like-sex and mixed-sex competitive conditions. 

A second purpose of this study is to validate 
an objective measure of n Achievement (O'Connor, 
1962). The selection of items for this test 
(the Achievement Risk-Preference Scale) was 
guided by the theory of achievement motivation 
formulated by Atkinson (1957). Atkinson's the- 
ory specifies that individuals high in resultant 
achievement motivation are concerned about 
success, tend to engage in achievement-related 
activities, and prefer tasks of intermediate diffi- 
culty. On the other hand, the theory states that 
individuals low in resultant achievement moti- 
vation are concerned about failure, tend to avoid 
achievement-related tasks, and prefer tasks which 
are too easy or too difficult in relation to their 
level of ability. The items on the Achievement 
Risk-Preference Scale (ARPS) reflect these theo- 
retical differences. The items involve choices 
between: the kind of affect, hope or fear, asso- 
ciated with achievement tasks; the direction of 
behavior, approach or avoidance, elicited by 
achievement-related tasks; and the level of difficulty, 
intermediate versus easy or hard, selected 
when constrained within an achievement-oriented 
situation. Some typical items are: 


1. I feel: 
a) unhappy when I do something less well 
than I had expected, 
b) happy when I do something better than I 
had expected. 

2. When I'm reading a magazine and come across 
puzzles or quizzes I: 
a) often stop and try them. 
b) rarely stop and try them. 

3. If I were a pinch hitter, I would like to come 
to bat when: 
a) my team was leading 6 to 3. 
b) the score was tied. 


It was hypothesized that subjects scoring high 
on this proposed measure of achievement moti- 
vation would exhibit a greater Zeigarnik effect 
than subjects scoring low on this measure. 


METHOD 


Thirty-three males and 37 females enrolled at the 
University of Michigan or the University of Minne- 
sota participated in the experiment. Pairs of sub- 
jects were brought into the experimental room and 
were seated at the far ends of a table, facing the 
experimenter. A partition was placed between the 
two subjects. Each subject was given a “Zeigarnik
booklet” which contained 20 simple puzzle tasks, 
for example, connecting dots, anagrams, etc. Every 
puzzle had two forms, long and short. It is very 
rare that long puzzles are completed within the 
allotted time period, while short puzzles almost 
always are completed. Four random sequences of
puzzles were selected so that each booklet contained
10 tasks of the long form and 10 of the short form.

Subjects were told that the experiment investi- 
gated the effects of competition on performance. 
The experimenter said that the subjects would be 
competing against one another, and that points 
would be awarded for successful completion of each 
task. The subjects were told that after they had 
attempted all the tasks their points would be totaled, 
and the "winner" would be the person with the 
highest number of points. 

Subjects were asked to raise their hands to signal 
the successful completion of a task. Each individual, 
therefore, could gauge how his performance com- 
pared to that of the competitor. On one-fourth of 
the tasks both subjects succeeded, on one-fourth of 
the tasks both subjects failed, and on one-half of 
the tasks one subject succeeded while the other 
failed. Seventy-five seconds were allowed for each 
task. When subjects raised their hands, the experi- 
menter recorded the results to give the impression 
of keeping score. 
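
The outcome proportions described above can be checked with a small sketch. It assumes, as the description implies, that outcomes were fixed by pairing short (completable) and long (uncompletable) forms within each pair of subjects. The function name, the tuple layout, and the use of a seeded shuffle are illustrative assumptions, not the authors' procedure.

```python
import random
from collections import Counter


def make_pair_schedule(n_tasks=20, seed=0):
    """Return a shuffled list of (form_for_subject_a, form_for_subject_b).

    Hypothetical reconstruction: "short" forms are expected to be completed,
    "long" forms are expected to remain incomplete.
    """
    quarter = n_tasks // 4
    schedule = (
        [("short", "short")] * quarter    # both subjects expected to succeed
        + [("long", "long")] * quarter    # both subjects expected to fail
        + [("short", "long")] * quarter   # only subject A expected to succeed
        + [("long", "short")] * quarter   # only subject B expected to succeed
    )
    random.Random(seed).shuffle(schedule)
    return schedule


schedule = make_pair_schedule()
forms_a = Counter(a for a, _ in schedule)  # forms seen by subject A
forms_b = Counter(b for _, b in schedule)  # forms seen by subject B
mixed = sum(1 for a, b in schedule if a != b)  # one succeeds, other fails
```

Under these assumptions each subject receives 10 short and 10 long forms, and half of the 20 tasks have mixed outcomes. This matches the stated one-fourth, one-fourth, one-half split.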

After the subjects completed the 20 tasks, the booklets
were collected and the ARPS was administered. During
this time interval the experimenter pretended to add 
the scores. The ARPS consists of 40 items for males 
and 50 for females; the test takes approximately 7 
minutes. After completing the test, subjects were
asked to recall the tasks in the Zeigarnik booklet. 
Two minutes were allowed for task recall. 


RESULTS 


Forty of the subjects recalled a greater percentage
of incompleted than completed tasks, 28
recalled a greater percentage of completed than
incompleted tasks, and for two subjects there was
equal recall of completed and incompleted tasks.
The mean difference between the recall of incompleted
and completed tasks was not statistically
significant, t = 1.42, df = 69, p < .20.


TABLE 1

MEAN PERCENTAGE RECALL, INCOMPLETED
MINUS COMPLETED TASKS

                        Sex of subject
Sex of              Male              Female
competitor       N     Mean        N     Mean
Male            14    -8.00%      19      21%
Female          19    13.36%      18     6.94%

Table 1 shows the relation between the ex- 
perimental conditions and the mean percentage 
recall of incompleted minus completed tasks. 
Table 1 indicates that both male and female 
subjects recalled relatively more incompleted 
than completed tasks when competing against a 
female than when competing against a male. 
An analysis of variance of the recall of incompleted
minus completed tasks reveals that
there is a significant main effect, F = 8.94, df =
1/66, p < .01, attributable to the sex of the
competitor. Further analysis indicates that males
exhibit a significantly greater Zeigarnik effect
when competing against females than when
competing against males, t = 3.02, df = 31,
p < .01. Only 3 of the 19 male subjects competing
against females recalled a greater percentage
of completed than incompleted tasks,
while 8 of the 14 males competing against other
males recalled more completed than incompleted
tasks. For female subjects the difference in recall
between the two competitive conditions was not
significant, t = 1.50, df = 35, p < .15.
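
The dependent measure used throughout these analyses can be sketched as follows. The sketch assumes that each percentage is taken relative to the number of tasks of that kind (10 long and 10 short per booklet). The recall counts are hypothetical and the function names are illustrative only.

```python
def recall_difference(recalled_incomplete, recalled_complete,
                      n_incomplete=10, n_complete=10):
    """Percentage of incompleted tasks recalled minus percentage of
    completed tasks recalled (a positive value is a Zeigarnik effect)."""
    pct_incomplete = 100.0 * recalled_incomplete / n_incomplete
    pct_complete = 100.0 * recalled_complete / n_complete
    return pct_incomplete - pct_complete


def tally(scores):
    """Count subjects by the direction of their difference score."""
    return {
        "incompleted > completed": sum(s > 0 for s in scores),
        "completed > incompleted": sum(s < 0 for s in scores),
        "equal": sum(s == 0 for s in scores),
    }


# Hypothetical subjects: (incompleted tasks recalled, completed tasks recalled)
hypothetical = [(6, 4), (3, 5), (5, 5), (7, 4)]
scores = [recall_difference(i, c) for i, c in hypothetical]
mean_score = sum(scores) / len(scores)
counts = tally(scores)
```

A cell mean in Table 1 corresponds to `mean_score` computed over the subjects in that cell. The counts of 40, 28, and 2 subjects reported in the text correspond to the three categories returned by `tally`.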

Relating scores on the ARPS to task recall 
reveals that males classified as high (above the 
median) on the ARPS recalled relatively more 
incompleted than completed tasks than males 
classified as low on this measure, t = 2.59, df =
31, p < .01. For females there was no significant
relation between recall and motive classification,
t = 1.2, df = 35, p < .20, although the direction
of the results was identical to that of the
males. This relationship was not enhanced when 
females were classified jointly according to sex 
of the competitor and score on the ARPS. 


DISCUSSION


The data revealed that females exhibit a 
greater Zeigarnik effect when competing against 
other females than when competing against males. 
However, the difference in recall between the
two conditions did not reach an acceptable level 
of significance, and the data do not clearly con- 
firm the intuitive notion that females suppress 
achievement strivings when the social context is 
one of competition against males. 

The data for the male subjects revealed that 
the Zeigarnik effect was maximized when compe- 
tition was against a female. This finding, con- 
sidered in conjunction with the data for female 
subjects, suggests that role theorists might find it 
profitable to reverse their perspective and attend 
to the enhancing effect which females have on 
male achievement strivings, rather than the in- 
hibiting effect which males are presumed to have 
on female achievement strivings.

The results relating scores on the ARPS to 
task recall indicate that this test may be a valid 
motive measure. The findings replicated those of 
Atkinson (1953) and Atkinson and Raphelson 
(1956) when a Thematic Apperception Test 
(TAT) was used to assess achievement motiva- 
tion. It is of interest to note that the relation 
between recall and scores on the ARPS supports 
the prediction for male subjects, while the re- 
sults for females do not confirm the hypothesis. 
The absence of a significant relationship between 
the ARPS and the dependent variable for female 
subjects parallels the frequent lack of significant 
results when employing the TAT as a motive 
measure for females (McClelland, Atkinson, 
Clark, & Lowell, 1953). 

Employing the ARPS as a motive measure 
presumed that the criteria for high achievement 
motivation as specified by Atkinson (1957) 
would covary with other performance indicators 
more than does the n Achievement score derived 
from the TAT. That is, it was thought that the 
reported preference for intermediate risks and 
the tendency to engage in achievement-related 
activities would predict differential task recall 
better than would the motive scores derived from 
the fantasy measure of n Achievement. The test 
was “bootstrapped” by the theory, and the posi- 
tive results in this study tend to validate the 
theory as well as the test (Cronbach & Meehl, 
1955). Further studies relating the ARPS to 
other performance measures are needed to enhance
the validity of this measure.


OBSERVER PRACTICE AND LEARNING DURING EXPOSURE
TO A MODEL


SEYMOUR M. BERGER 


Indiana University 


The observer's (O's) behavior was examined in relation to a model’s (M's)
choices when O had a separate source of information about task-relevant behavior.
These experiments showed that: (a) O's practiced M's choices during
the exposure period, even without overt performance by M; (b) subsequently,
O's did not necessarily choose to perform in a learning task in a
manner that was consistent with their prior practice; (c) O's practice of M's
choices was not dependent upon instructions regarding the subsequent learning
task; (d) O's practice and learning was largely dependent upon M's choices 
rather than the general popularity of these choices for learning. It is suggested 
that observational learning is the result of an ongoing tendency for O's to 
practice M's behaviors during the exposure period. 


There is evidence to show that the behavior 
of one person (observer) can be influenced by 
exposure to the behavior of someone else 
(model). Bandura and Walters (1963) have re- 
viewed much of this literature in relation to 
imitation and social learning. In the general 
paradigm employed in these studies, the observer 
is exposed to the model's behavior and is subse- 
quently tested to determine the effect of the 
model's behavior on the observer’s learning. In 
these studies, the observer's behavior during the 
exposure period has generally been ignored. An
examination of such behavior may be of relevance
to the investigation of how the model's
behavior can affect observational learning. This 
approach is based upon the general theoretical 
assumption that the observer, far from being a 
passive bystander, is actively engaged in prac- 
ticing the behavior of others. Viewpoints in 
agreement with the foregoing have been expressed
by Hebb (1960, p. 742) and Cook (1961,
pp. 355-356). This notion of active participation
is supported by studies of changes in GSR
(Berger, 1962) and changes in heart rate (Kagan 
& Phillips, 1964) during exposure. Besides these 
physiological manifestations, covert practice by 
the observer has been regarded as a relevant 
component of social learning (e.g., Maccoby,
1959). The investigation of the role of the observer’s
practice, however, has been limited by
the lack of immediate overt behavioral measures
which can be related directly to the model’s 
behavior. 

The studies reported below employ such a
measure. The observers were exposed to a model
learning items from the manual alphabet for the
deaf. When apparently out of sight of the experimenter,
observers tend to practice these hand
signals overtly and distinctly. Consequently, it is
possible to record an observer’s reactions during
his exposure to the model and determine how
the model's behavior affects the observer's practice
and subsequent measures of his learning.

Each of the experiments reported below deals 
with different but related aspects of the rela- 
tionship between the model's choices and the 
observer’s behavior. 

Experiment I investigates whether the model's 
choices of items to learn, and overt performance 
of these items, affect the observer's practice and 
choice when the observer is told that he will later 
participate in the same experiment. 

Experiment II examines whether the effect 
of the model's choices on the observer's practice 
is dependent upon the instructions that the 
observer will subsequently participate in the same 
learning task. 

Experiment III investigates whether the general
preference for individual items, independently
of the model's choice, affects the observer's
practice and learning. 


GENERAL SETTING AND PROCEDURE 


Three rooms, separated by one-way mirrors, were 
used for these studies. The observer's (middle) room 
was mirrored on both sides so that observers (who 
were run individually) had a clear view of the 
model's room and, in turn, were in full view of a 
human recorder in the third room. The observers 
could listen to the activity in the model’s room over 
a loudspeaker. A copy of the learning materials, con- 
taining all of the pairs of letters and hand signals 
used in the manual alphabet for the deaf, was taped 
at about eye level to the mirror separating the ob- 
server’s and model’s rooms, and was in full view of 
the observer. 

Except for observers assigned to the control group 
in Experiment I, the experimenter seated the ob- 
server in the observer’s room and read instructions 
which informed him that shortly he would see and 
hear another person in the next room participating 
in a learning experiment. Except for the observers 
assigned to Group W (watching instructions) in 
Experiment II, each observer was told that he 
would participate in the same experiment after the 
other person had completed the experiment and left 
the room, and that the instructions and learning 
materials would be the same for both of them. The 
experimenter identified the sample copy of the 
manual alphabet for the deaf on the mirror as the 
learning materials. The experimenter left the room
and, after a few moments' delay, appeared in the
model's room with the model, who was the experi- 
menter's confederate. After seating the model in full 
view of the observer, the experimenter read the 
instructions. The experiment was described as a 
study of learning which differed from traditional 
learning experiments; instead of the experimenter 
selecting the learning materials, the model would be 
given some choice. The learning materials were the 
pairs of letters and hand signals comprising the 
manual alphabet for the deaf. The model was to 
select 6 out of the 26 pairs to learn, the only restriction
being that the pairs could not be chosen
in alphabetical order. The model was given a copy 
of the manual alphabet for the deaf (which was 
exactly the same as the copy in the observer’s room)
and was asked to make each selection aloud so 
that the experimenter could record it. The experi- 
menter then described the training procedure in 
which the model would have 5 seconds (as signaled 
by the experimenter) to learn each pair. The model 
would announce the letter of each pair as she tried 
to learn it. Finally, the instructions described the 
matching type of recognition test that would be used 
to determine how much the model had learned. The 
experimenter would call out the letters of the pairs 
chosen by the model, and the model would point to 
the appropriate hand signal on an answer sheet 
which contained drawings of all 26 hand signals in 
random order.

The procedure followed after the model left the 
model’s room is described for each experiment. 

All observers (except those in Group C—the con- 
trol group—of Experiment I) were under the sur- 
veillance of a human recorder in the third room who 
noted all of the hand signals that each observer 
performed. 


EXPERIMENT I 
Method 


Subjects and procedure. A total of 72 undergradu- 
ate students at Indiana University were arbitrarily 
assigned to three groups of 24 observers each. 

The observers in Group P (performance) were 
seated in the observer's room and watched the model 
perform hand signals while making her selections and 
while practicing during training. During the selec- 
tion phase, the model announced the letter chosen 
and performed the hand signal once, holding it for 
about 1 second. During the training phase, she an- 
nounced the letter she was practicing and performed 
the hand signal simultaneously, repeating this per- 
formance three times during the 5-second interval 
for each pair. For observers in Group NP (no per- 
formance), the model announced the letters of the 
pairs during selection and training in the same man- 
ner as for Group P, but she did not perform the 
hand signals at any time. 

After the model left, the experimenter brought the 
observer from the observer's room to the model's 
room and instructed him that since he had heard 
the instructions for the experiment he would start 
by choosing the 6 pairs he wished to learn from the
list of 26 pairs. If the observer asked whether he 
could choose the same pairs as the model, the ex- 
perimenter repeated the following phrase from the 
instructions: "You may select any six you wish."
After the observer made his selections and the ex- 
perimenter recorded them, the experimenter informed 
him that the experiment was over and that he did 
not have to go through the learning procedure. The 
observer was told that the experimenter was interested
in the extent to which the observer's choices
were influenced by the choices of the model. The 
observer was thanked for his participation and asked 
not to discuss the experiment with anyone else. 

The observers in Group C, the control group, 
were brought directly to the model’s room; then the 
experimenter read the instructions which were read 
to the model for the other two groups, and obtained 
the control observer’s choices of 6 pairs to learn. 
The experimenter informed the observer that he 
would not go through the learning procedure. The 
observer was told that the experimenter was in- 
terested in determining which pairs would be chosen 
if the observer thought he had to learn them. Again,
the observer was thanked for his participation and 
requested not to discuss the experiment with any- 
one. 

In order to obtain a basis for determining the 
extent to which the model’s choices influenced the 
observer’s practice and subsequent choices of pairs, 
two lists of 6 pairs were used by the model for
Groups P and NP. Half of the observers in each 
group observed the model select one list while the 
other half witnessed the model select the other list. 
Since pilot work revealed that some pairs were more 
likely to be chosen than others, two lists provided a 
basis for comparing the specific influence of the 
model's choices on the observer’s practice and 
choice, while holding the preference for the items 
constant. Both lists were composed of relatively
popular items which appeared in similar locations
on the copy of learning materials.

In addition to controlling the model's choices, the 
model's success on the recognition test was predetermined
in order to see if this was a factor
influencing the observer's subsequent choices. The 
model always correctly matched half of the pairs on 
the recognition test and gave no response for the 
other half. The three correct matches were counter- 
balanced over observers in each of the two experi- 
mental groups. 


Results and Discussion


Observer practice. The number of different 
signals practiced by 11 observers in the experiment
was noted by two recorders. All other
observers were observed by only one recorder.
The interrecorder reliability coefficient was .98;
this is consonant with a coefficient of .99
obtained in a previous study.

The data analysis of the observer’s practice
was based upon the signals performed by the
observer from the time the model announced her
first choice until the observer left the observer's
room.


TABLE 1

MEAN PROPORTION OF DIFFERENT HAND
SIGNALS PRACTICED

Group     Model's     Alternate     Remaining     Total
P           .62          .17           .06         .21
NP          .44          .25           .11         .22

One or more hand signals were practiced by 
83% of Group P and by 71% of Group NP. 
The results, presented in Table 1, show that 
Group P practiced more signals from the 
model's list than from the alternate list (T = 0,
p < .01, N = 18, Wilcoxon signed-ranks test).
Greater practice of the model’s list was also
found in Group NP (T = 13, p < .01, N = 14).
This difference in practice between the two lists,
however, was greater for Group P than for
Group NP (z = 2.25, p < .05, Mann-Whitney U
test).
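
For readers unfamiliar with the Wilcoxon signed-ranks statistic reported here, the following is a minimal sketch of the standard textbook computation. It is not the author's own procedure, and the counts are hypothetical. T is the smaller of the two sums of signed ranks, and N counts only the observers whose two scores differ, which is why N can be smaller than the group size of 24.

```python
def wilcoxon_t(x, y):
    """Return (T, N) for paired scores x and y.

    Zero differences are dropped; tied absolute differences receive the
    average of the ranks they span.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(positive, negative), n


# Hypothetical counts of signals practiced from each list by six observers.
model_list = [4, 3, 5, 2, 0, 6]
alternate_list = [1, 3, 2, 0, 0, 1]
t_stat, n_nonzero = wilcoxon_t(model_list, alternate_list)
```

In this hypothetical case every nonzero difference favors the model's list, so T = 0 with N = 4. A reported T = 0 therefore means that no observer with a nonzero difference practiced more alternate-list signals than model-list signals.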

Clearly, the observers tended to practice in 
accordance with the model's announced choices. 
In addition, performance of signals by the model 
resulted in increased practice of these signals 
by the observer. 

Observers’ choice of items to be learned. Although
items to be learned were not generally
selected according to the model's choices, there
were individual cases in which all or none of the
model's choices were selected. This produced a
larger variability in Groups P and NP than in
Group C (χ² = 10.26, p < .01, Bartlett's test).
The variances of Groups P and NP differed from
Group C but not from each other (P versus C,
F = 3.43, p < .01; NP versus C, F = 3.48,
p < .01). Although a systematic effort was not
made to investigate the sources of these differences
in variability, informal questioning revealed
that many observers avoided the model's
choices because they had learned those pairs
already and believed it would have been inappropriate
to select those pairs for the learning
task.

No relationship was found between the cor- 
rectness of the model’s performance on the 
recognition test and the subsequent selections 
made by observers in either Groups P or NP. 


EXPERIMENT II 


In Experiment I the observers practiced the 
signals chosen by the model but did not tend
to choose them for the subsequent learning task.
It appears, therefore, that practice of the model's
choices is not just preparation for the task
ahead. This study investigates the effects of instructions
regarding the subsequent learning task
on the observer's practice.

If it is assumed that observers practice the 
model’s choices in order to learn the materials, 
or to somehow “warm up" for the task ahead, 
one would expect little or no practice if observers 
are simply instructed that after having watched 
the model they would be allowed to leave. Pilot 
work suggested that observers who were asked
just to watch the model tended to practice the
signals chosen by the model, but very few other
signals; observers under the usual instructions
concerning subsequent participation in the learning
task also tended to practice signals chosen by
the model, but they practiced several other 
signals as well. 


Method 


Subjects and procedure. A total of 60 undergraduate
students, who received course credit for participation,
were alternately assigned to one of two
groups. The 30 observers in Group L (learning instructions)
were given the same instructions as
Group P in Experiment I. Each of the 30 observers
in Group W (watching instructions) was told that
two students had accidentally signed up for the
same time. In view of this administrative error, the
experimenter would give him credit for coming if he
just watched the other student go through the experiment.
The instructions indicated that he would
be free to leave after the other student was finished. 
The instructions to the model were the same as 
described previously for Group P; the model be- 
haved accordingly, except that she used only one of 
the lists of 6 pairs from Experiment I for her choices. 


Results 


In all, 90% of the observers in Group L and 
83% of the observers in Group W performed 
one or more hand signals. 

Before the model announced her selections, 
there were no reliable differences in the propor- 
tions of signals performed either by the two 
groups or from the two lists (Table 2, left 
portion). After the model made her selections, 
however (Table 2, right portion), both groups 
practiced a greater proportion of items from the 
model's list than from the remaining items
(Group L, T = 0, p < .01, N = 26; Group W,
T = 0, p < .01, N = 25). Both groups showed
almost the same amount of practice from the
model's list (z = .24, Mann-Whitney U test),
but Group L practiced more of the remaining
items than Group W (z = 2.33, p < .02). Consequently,
it appears that the observer's practice
of the model's choices is not dependent upon
instructions concerning subsequent participation
in the same situation, but that practice of the
remaining pairs is related to these instructions.


TABLE 2

MEAN PROPORTION OF DIFFERENT HAND
SIGNALS PRACTICED

         Prior to the model's selections     After the model's selections
Group    Model's   Remaining   Total         Model's   Remaining   Total
L          .09        .10       .10            .77        .11       .26
W          .10        .10       .10            .73        .04       .20
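
The relation between the list-wise proportions and an overall proportion can be illustrated with a short sketch. It rests on an assumption the source does not state explicitly: that a Total entry is the proportion of all 26 signals practiced, so that it equals the item-weighted average of the proportion for the model's 6 items and the proportion for the 20 remaining items. The function name is illustrative, and the inputs are the Group W values after the model's selections, read as .73 and .04.

```python
def overall_proportion(p_model, p_remaining, n_model=6, n_total=26):
    """Item-weighted average of two list-wise proportions.

    Assumes the overall figure is taken over all n_total signals.
    """
    n_remaining = n_total - n_model
    return (n_model * p_model + n_remaining * p_remaining) / n_total


# Group W after the model's selections (values as read from Table 2).
total_w = overall_proportion(0.73, 0.04)
```

Under this assumption the result rounds to .20, in agreement with the Total entry for Group W.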


EXPERIMENT III


In the foregoing studies the model's lists were 
composed of items which had been chosen more 
frequently than others in exploratory studies. 
Experiments I and II have shown that the 
model's choice influences the observer's practice 
of these popular items. The present study in- 
vestigates whether the observer's practice and 
learning of unpopular items increases when the 
model chooses unpopular rather than popular 
items. 


Method


Subjects and procedure. A total of 24 undergraduate
students were alternately assigned to one of two
groups of 12 observers each. The observers assigned 
to Group PO (popular) were exposed to the model 
who chose popular items and those assigned to 
Group UP (unpopular) were exposed to the same 
model who chose unpopular items. The two lists of 
items were based upon the choices of the control 
observers (Group C) in Experiment I. The list for 
Group PO was composed of items frequently chosen 
for learning by Group C (popular items), while the
list for Group UP was composed of items chosen infrequently
by these observers (unpopular items).

The general procedure for this study was the 
same as that used for Group P in Experiment I, 
except for the introduction of a learning test immedi- 
ately following the demonstration period. As soon as 
he was seated in the model’s room, the observer was 
asked to announce the letter and perform the ap- 
propriate hand signal for all the items he knew. The 
criterion for a correctly performed hand signal was 
an exact match of the pictorial version in the observ- 
er’s room, although right-left reversals were accepted. 


Results and Discussion 


Practice and item popularity. In all, 83% of 
Group PO and 75% of Group UP performed one 
or more hand signals. The observer’s practice of 
hand signals was recorded following the model’s 
selections and was classified into two categories: 
performance of popular items, and performance
of unpopular items. The proportions of different
items practiced in the two categories are presented
in Table 3. Group UP practiced more
unpopular items than Group PO (Mann-Whitney
U = 32, p < .05), but the groups did not differ
reliably in their performance of popular items.
Group PO practiced more popular items than
unpopular ones (T = 0, p < .01, N = 9), but
for Group UP this difference was not reliable.


TABLE 3

MEAN PROPORTION OF PRACTICE AND LEARNING
FOR DIFFERENT CATEGORIES OF ITEMS

            Popular items           Unpopular items              Total
Group    Practice   Learning      Practice   Learning      Practice   Learning
PO         .47        .46           .06        .00           .18     [illegible]
UP         .36    [illegible]       .36        .26           .28        .12

The influence of the model’s choice was thus 
effective in overcoming the unpopularity of the 
items in Group UP, but apparently had little 
effect in the case of popular items. 

Learning and item popularity. The propor- 
tions of different items learned in the two cate- 
gories are presented in Table 3. Group PO 
learned more popular items than Group UP 
(U = 23, p < .01), and Group UP learned more
unpopular items than Group PO (U = 24, p < .01).

More popular items than unpopular ones were 
learned by Group PO (T = 0, p < .01, N = 11);
for Group UP this tendency was reversed but 
the differences were not reliable. 

The correlation between performance of dif- 
ferent hand signals and the number of items 
correctly recalled was .47 (p < .05). 

The results of this study indicate that the 
observer’s practice and learning of unpopular 
items increases when the model chooses un- 
popular rather than popular items. 


DISCUSSION


The occurrence of overt practice during ex- 
posure to the model’s behavior was demonstrated 
in each of the three studies. An overall mean 
of 81% of the observers practiced one or more 
hand signals from the model’s list. It was also 
demonstrated (Experiment III) that level of
retention is directly related to the number of 
different signals practiced by the observer. It 
may be concluded, therefore, that the model’s 
choice behavior determines to a large extent 
what the observer practices, and hence, what he
learns.


These results are not in accord with social 
learning theories that attribute the observer’s
learning to the effect of immediate or anticipated
reinforcement. The observers were not directly
reinforced for imitating the model’s responses as
is the case in matched-dependent behavior situations
(Miller & Dollard, 1941). Nor was there
evidence to suggest that the observers behaved
in accordance with anticipated reinforcements
(e.g., Rotter, 1954), since in Experiment I the
observers' practice was not related to their subsequent
choices on the learning task. Furthermore,
in Experiment II, Group W practiced the 
model's choices even though the observers were 
instructed that they would not participate in 
the experiment, and thus had little basis for 
anticipation of reinforcement. 

Reinforcement principles (as currently em- 
ployed in social learning theories) may account 
for some observational learning effects, but the 
present results show that observational learning 
can occur independently of reinforcement. 

In these studies there is an ongoing tendency 
to imitate the model during the exposure period. 
This imitation is a form of practice which is 
positively correlated with a measure of observa- 
tional learning. However, the moderate degree 
of this correlation suggests that other factors 
are operating which must be considered as well. 
Recently, Bandura (1962) has proposed that 
sensory contiguity is a sufficient condition for 
observational learning, since observers "often 
[learn] without any opportunity to perform the 
model's behavior in the exposure setting . . .
[p. 216]." There is some evidence in Experi- 
ment III that observational learning does occur 
without overt response. Among the 24 observers 
in that experiment, 5 observers did not perform 
any hand signals (according to the records ob- 
tained), yet 4 of these observers did show some 
learning. Furthermore, even among the observers 
who did perform, observers occasionally learned 
pairs which they had not practiced. It should be 
noted, however, that the definition of overt 
response is inevitably arbitrary. Only responses 
which were identifiable as hand signals were
recorded in these studies; no recordings were 
made of fragmentary hand movements, vocaliza- 
tions, or mouth movements, all of which fre- 
quently were observed. It appears, therefore, that 
more intensive research is needed in order to 
understand the role of overt responding in 
observational learning. 


DISSONANCE AND THE REVISION OF CHOICE CRITERIA 1


DONALD D. PENNER, GORDON FITCH 


Purdue University 


AND KARL E. WEICK 


University of Minnesota 


If attributes of a chosen alternative are cognitively enhanced to justify a 
decision, additional justification should occur if the enhanced attributes be- 
come prescriptive for subsequent decisions. The present experiment examines 
the plausibility of this proposed link between evaluation and behavior. Ss 
assumed the role of an employer about to hire a vice-president. After rating 
the importance of 8 traits for this position, they read descriptions of 2 candi- 
dates, chose 1 of them, reevaluated the traits, chose between 2 additional 
candidates, and completed a 3rd rating of the traits. χ² analyses showed that Ss
reevaluated the traits on the 2nd rating so that they were more consistent with 
the initial decision. Furthermore, the 2nd set of ratings was a better predictor 
of the 2nd choice than were the initial ratings. 


Once a decision is made, attributes of the 
chosen alternative often take on added value. 
Although this enhancement affords some justifi- 
cation for the decision, further support should 
occur if the cognitive realignment is followed by 
consistent acts (Weick, 1964). Specifically, if 
attributes of the chosen alternative are judged to 
be important, this evaluation gains plausibility 
if these attributes actually are used as criteria 
for subsequent choices. Attributes that are prescriptive
are attributes that are valid. Thus, conditions
are created where “changes in cognitions
about beliefs and about behavior operate as com- 
plementary or mutually reinforcing tactics of 
dissonance reduction [Weick, 1964, p. 539].” 

It is not immediately obvious that postchoice 
reevaluations need be as prescriptive for subse- 
quent actions as is suggested. It is possible that 


¹ Support for this study was provided by the

temporary cognitive changes are sufficient to
reduce postdecision dissonance. Reevaluations may 
persist only until the discomfort falls below some 
threshold value (Rosenberg, 1960). Once the dis- 
comfort is reduced, evaluations may return to 
their original intensity and direction. If reevalu- 
ations are this short-lived, one would expect 
short-term postdecision behavior to be more
consistent with postdecision cognitions than with
predecision cognitions. However, actions further 
removed in time from the decision should follow 
more directly from cognitions present before the 
decision was made. 

Results consistent with this hypothesized se- 
quence have been reported by Walster (1964). 
She found that as time passed after a decision 
was made, the amount of reevaluation decreased 
rather than increased. Ninety minutes after a 
recruit had chosen one of two military jobs, his 
evaluations of the rejected and chosen alternatives
were almost identical with the initial evaluations
instead of being spread further apart.
Walster explains this unexpected finding as due to 
the limited amount of new information that the 
isolated subject could muster in support of his 
decision. He reduced as much dissonance as he
could without the help of others. As the recruit 
concentrated on the remaining dissonance and 
was unable to resolve it, he experienced increased 
pressures to reverse the decision, a pressure that 
is reflected in lessened enhancement and depreci- 
ation of the alternatives. However, it is possible 
that the person felt minimal rather than maximal 
discomfort. The earlier reevaluation may have 
been sufficient to reduce the discomfort associated
with the decision. With the abatement of
tension it became less critical for the reevalua- 
tions to persist. It is also possible that reevalu- 
ations in the Walster study were fleeting because 
subjects did not have any opportunity to act on 
behalf of their revised beliefs. Validation of the 
beliefs by consistent action was not possible, and, 
with insufficient support, the changes may have 
receded.

That dissonance resolutions may be enduring
and prescriptive is suggested by Freedman (1965).
He found that children who resisted the tempta- 
tion to play with an attractive toy for a mild 
threat were significantly less likely to play with 
the toy several weeks later than were children 
who resisted because of strong threats. Presumably
a mild threat placed more pressure on the
child to justify his avoidance of the attractive 
toy. While it is unclear precisely what justifica- 
tion was used—the toy was not devalued more 
under mild threat—it is clear that the changes 
in belief made to justify avoidance were stable 
and that subsequent actions served to validate 
these realignments.

Even if cognitive realignments and action co- 
incide more closely after a decision, this does not 
provide unequivocal support for a dissonance
interpretation. Kelman (1962) and Deutsch, 
Krauss, and Rosenau (1962) suggest that behavioral
change is likely after a decision, not
necessarily to reduce dissonance, but rather be- 
cause the “action may provide the occasion for 
the occurrence of new experiences in relation to
the object [Kelman, 1962, p. 86]” or “the activities
in pursuit of the chosen alternative . . .
change the situation in such a way that new additional
consequences become associated with the
chosen and nonchosen alternatives [Deutsch et
al., 1962, p. 17].” These explanations imply that
discrepant acts are apt to be followed by openness
to new experiences rather than felt pressures
to support the decisions. Furthermore, enhancement
or depreciation are not regarded as
necessary antecedents of these behaviors. Regardless
of the mechanism, there is agreement that
postdecision acts serve to stabilize the decision, 
whether by design or by "accident." 

It is clear that data and propositions about 
actions that follow irreversible choices among 
nonoverlapping alternatives are inconclusive. The
present experiment examines the fate of a com- 
mon set of attributes as they are associated with 
two separate decisions. Interest lies in the
reevaluation of the attributes after the initial
decision and the degree to which a second decision
is influenced by this reevaluation. The question
of interest is, essentially: In what ways does
the resolution of one dissonant situation constrain
the actions associated with the arousal
and reduction of dissonance in a second-choice
situation?


METHOD 
Procedure 


Subjects were 96 college juniors and seniors from 
an industrial management curriculum meeting at 
their regular class times. Each subject was asked to 
imagine that he was the president of a medium-sized 
manufacturing and engineering company which had 
an opening for a vice-president in charge of production.
Each subject rated the importance of the following
eight traits as criteria for selecting persons
to fill the opening: leadership, experience, educa- 
tion, age, technical skill, sociability, intelligence, 
self-confidence. The traits were rated on an 11-point 
scale ranging from 0 (completely unimportant) to 
10 (extremely important). Each subject was then 
given descriptions of two candidates for the job 
and asked to choose one of them as the person best 
suited for the job. The descriptions were constructed 
so that four of the traits were highly characteristic 
of one candidate and the other four were highly 
characteristic of the other candidate. 

The following description of a candidate who is 
high on intelligence, self-confidence, sociability, and 
education, but low on leadership, experience, age, 
and technical skill is presented in the same format 
as it was received by the subjects. 


REPORT FROM PERSONNEL MANAGER 


CANDIDATE: David Jones 

Age, 62 Married, 3 children 

EDUCATION: MA in Business Administration 

EXPERIENCE: Professor of Business Adminis- 
tration for last 25 years. Has done some indus- 
trial consulting. 

PERSONALITY ASSESSMENT: Mr. Jones is 
highly intelligent. He scored in the upper 15% 
on the Executive Intelligence Test (EIT). He 
has some knowledge of engineering concepts 
which should be adequate. However, this is not
one of his strong points.

His leadership qualities are somewhat above 
average but certainly not outstanding. However,
he has a high degree of self-confidence and
feels that he can handle the job quite well. 

He seems very cordial and easy to get along 
with. He should have no problems concerning 
interpersonal relations. 


After one of the candidates was chosen, the eight
traits were rated a second time. After a 7-minute
delay during which the subjects read two more
descriptions, a choice was again made between two
different candidates. These last two descriptions of
candidates were constructed so that two traits
characteristic of the first candidate and two traits
characteristic of the second candidate were combined to
make up the set of four highly characteristic traits
for each member of the second set of candidates.
Thus each candidate in the second pair had two
traits highly characteristic of the candidate who
was chosen in the first decision and two traits
highly characteristic of the rejected candidate. The
second pair of candidates was characterized in this
way to increase the likelihood that the first and
second set of ratings would predict different choices
in the second decision situation. Only if this condition
were approximated would it be possible to assess
whether pre- or postdecision ratings were more
influential in subsequent behavior.
After this second choice the traits were rated a 
third time, again with a different order of presenta- 
tion. Half the subjects received Candidates 1 and 2 
as their first pair and the other subjects received 
Candidates 3 and 4 as their first pair. 

The specific hypotheses to be tested were: 

1. After choosing a candidate, subjects will reevaluate
the traits so that those traits characteristic
of the chosen candidate will be more important
and/or those traits characteristic of the rejected
candidate will be rated as less important than in the
original rating.

2. On the second choice, if the original rating and 
the second rating favor different candidates, the 
second ratings will be used to determine the choice. 


Method of Analysis 


A set of importance ratings is said to predict the 
choice of a particular candidate if the sum of the 
ratings for the qualities in which that candidate 
excels is greater than the sum of the ratings of the 
qualities in which the other candidate excels. In the 
case of the hypothetical subject described in Table 1 
the first ratings predict the choice of Brown over 
Green and Smith over Jones. The sum of the first 
ratings for Brown’s traits (ABCD) 9+8+5+4= 
26 is greater than the sum of the ratings of Green’s 
traits (EFGH) 10+0+7+6=23, and the sum of 
the ratings of Smith's traits (ABEF) 9+8+10+ 
0=27 is greater than the sum of the ratings of 
Jones’ traits (CDGH) 5+4+7+6=22. However, 
the second ratings make the opposite prediction for 
the Smith-Jones choice. That is, using only the sec- 
ond set of ratings, the sum of the ratings of Jones’ 
traits (CDGH) 8+9+6+5=28 is greater than 
the sum of the ratings of Smith’s traits (ABEF) 
10+9+7+0=26.

TABLE 1
DATA FOR HYPOTHETICAL SUBJECT


Trait   First rating   Candidate   Second rating   Third rating
A             9        Brown            10               8
B             8        Brown             9               7
C             5        Brown             8               9
D             4        Brown             9              10
E            10        Green             7               5
F             0        Green             0               0
G             7        Green             6               8
H             6        Green             5               7


Note.—First choice = B, second choice = J. 


To assess the amount of criteria enhancement 
after the first decision, the first set of ratings is 
compared with the second set. In the example, the 
first ratings give Brown, the chosen candidate, an 
importance score of 26 and Green, the rejected 
candidate, a score of 23, a difference of only 3 im- 
portance scale points. The ratings made after the 
decision (Rating 2) give Brown a score of 10+
9+8+9=36 and Green a score of 7+0+6+5
= 18, a difference favoring the chosen candidate,
Brown, of 18 importance scale points. This indicates 
an overall enhancement effect of 15 scale points. 

Similarly, enhancement of the decision criteria in 
the second-choice situation may be evaluated by 
comparing Rating 2 with Rating 3. In the example, 
Rating 2 gives Jones a score of 8+9+6+5=28 
and Smith 10+9+7+0= 26, while Rating 3 gives 
Jones a score of 10+9+8+7=34 and Smith a 
score of 8+7+5+0=20 for a net enhancement 
of 12 importance scale points. 
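
The scoring rules described above can be sketched in a few lines of Python. This is only an illustration built from the hypothetical subject in Table 1, not the authors' procedure: the function names `score`, `predicted`, and `enhancement` are assumptions introduced here, and the second rating of trait A is taken as 10 because the worked sums in the text require that value.

```python
# Importance ratings of the hypothetical subject in Table 1 (traits A-H).
# Assumption: the second rating of trait A is 10, as the sums in the text imply.
RATINGS = {
    1: dict(A=9, B=8, C=5, D=4, E=10, F=0, G=7, H=6),
    2: dict(A=10, B=9, C=8, D=9, E=7, F=0, G=6, H=5),
    3: dict(A=8, B=7, C=9, D=10, E=5, F=0, G=8, H=7),
}

# Traits in which each candidate excels, as given in the text.
CANDIDATES = {"Brown": "ABCD", "Green": "EFGH", "Smith": "ABEF", "Jones": "CDGH"}


def score(rating, candidate):
    """Sum of the importance ratings for the traits in which a candidate excels."""
    return sum(RATINGS[rating][trait] for trait in CANDIDATES[candidate])


def predicted(rating, first, second):
    """Candidate favored by a set of ratings, or None when the sums are tied."""
    a, b = score(rating, first), score(rating, second)
    if a == b:
        return None
    return first if a > b else second


def enhancement(before, after, chosen, rejected):
    """Change in the margin favoring the chosen candidate between two ratings."""
    def margin(rating):
        return score(rating, chosen) - score(rating, rejected)
    return margin(after) - margin(before)


print(predicted(1, "Smith", "Jones"), predicted(2, "Smith", "Jones"))
print(enhancement(1, 2, "Brown", "Green"), enhancement(2, 3, "Jones", "Smith"))
```

With these data the first ratings favor Smith and the second ratings favor Jones, and the two enhancement scores are 15 and 12 scale points, matching the worked example.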


RESULTS 
Use of Importance Ratings 


In order to determine if subjects actually 
used their importance ratings to select the “best” 
candidate for the job, that is, the candidate 
whose characteristic traits were rated as most 
important, scores were computed for each subject 
for each of the two candidates by adding the 
importance ratings of the four traits characteristic
of each candidate and then comparing the
two scores. Seventy-two percent (69/96) of the 
subjects chose the candidate with the higher 
score, that is, chose the candidate who possessed 
the traits which the subject thought were most 
important. This is significantly different from a 
chance distribution at less than the .001 level 
(χ² = 18.36, df = 1).² It should also be noted

² All statistical tests are two-tailed (Siegel, 1956, p. ).
In all tests, observed frequencies were compared
to the null hypothesis that data supporting prediction
and data not supporting prediction should

that of the 27 subjects who did not choose the
higher rated candidate 6 subjects had identical 
importance ratings for each candidate. Also the 
mean difference between candidate rating scores 
was only 2.63 units for the remaining 21 subjects 
who chose the lower rated candidate, while the 
mean difference between candidate rating scores 
was 4.88 units for the 69 subjects who chose the 
higher rated candidate. Thus, the "decision re- 
versals" were most likely to occur among those 
subjects who felt the candidates were quite 
similar in attractiveness. These 27 subjects, incidentally,
would be assumed to have the largest
magnitude of postdecision dissonance because 
the choice would require them to reject a highly 
attractive candidate whose features would not be
found in the candidate they chose. 
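
The χ² values reported in this section can be approximated with a one-degree-of-freedom goodness-of-fit test against an even split. The sketch below is an illustration under that assumption, not the authors' computation; the function name `chi_square_equal_split` is hypothetical, and the p value relies on the identity that, for df = 1, the upper-tail probability equals erfc(√(χ²/2)).

```python
import math


def chi_square_equal_split(supporting, not_supporting):
    """Chi-square (df = 1) for two observed counts against a 50/50 null split.

    Returns the statistic and its upper-tail probability.
    """
    n = supporting + not_supporting
    expected = n / 2  # each category is expected to hold half the cases
    stat = ((supporting - expected) ** 2 + (not_supporting - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(stat / 2))  # survival function for df = 1
    return stat, p_value


# 69 of 96 subjects chose the higher rated candidate; 27 did not.
stat, p = chi_square_equal_split(69, 27)
print(round(stat, 3), p < 0.001)
```

For 69 versus 27 the statistic comes out at 18.375, close to the reported 18.36; the small gap may reflect rounding in the original computation.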

That dissonance increases as the candidates 
are rated more equal in attractiveness is shown 
by a comparison of the 32 subjects whose initial 
importance ratings showed the greatest difference 
between candidates and the 32 subjects whose 
initial importance ratings showed the least difference
between candidates. In the large difference
group, when the traits were rerated after the 
decision, 15 subjects changed their ratings in a 
positive direction (i.e., so that the characteristic
traits of the chosen candidate were more im- 
portant and/or the characteristic traits of the 
rejected candidate were less important), and 17 
did not change or changed in a negative direction. 
This distribution is not significantly different 
from chance (χ² = .124, df = 1). In the small
difference group 27 subjects changed in a positive 
direction and 5 in a negative direction. This dis- 
tribution is significantly different from chance 
at the .001 level (χ² = 15.12, df = 1).

The dissonance interpretation of the differen- 
tial effects reported above is, however, not un- 
equivocal. The subjects in the large difference 
group may have experienced something of a ceil- 
ing effect which may have biased their results in 
a negative direction. Thus, the lack of positive 
change in this group could be due to this biasing 
effect rather than to the assumed lesser magni- 
tude of dissonance. However, bias should not 
have occurred in the small difference group since
the importance ratings are free to move in either
direction.


Criterion Enhancement 


A positive change was scored for both evaluating
a trait characteristic of the chosen candidate
as more important after the choice and for

occur randomly with equal probabilities. Nonpredictable
cases were treated as not supporting the
prediction; therefore, all tests reported are biased in
the conservative direction.

evaluating a trait characteristic of the rejected
candidate as less important. It was found that 
65 of the 96 subjects changed in a positive direction,
6 subjects did not change, and 25 changed
in a negative direction. The null hypothesis that
changes, if any, should occur in a random fashion
with probability of .5 of positive change and .5
of no change or negative change can be rejected
at the .001 level (χ² = 12.04, df = 1).

Positive change was also found after the sec- 
ond choice. Sixty-six of the 96 subjects changed 
in a positive direction, 9 did not change, and 21 
changed in a negative direction. This distribution 
is significantly different from chance at the .001 
level (χ² = 13.50, df = 1). Hypothesis 1 thus
received strong support. 

Hypothesis 2 states that the trait ratings made 
after the initial decision will predict more ac- 
curately which candidate the subject will choose 
on his second decision than will the predecision 
ratings. Data relevant to this hypothesis can be 
generated in two different ways. First, we can 
examine those persons whose first and second 
ratings of importance favored different candi- 
dates in the second-choice situation. In this 
study, 39 subjects conformed to this require- 
ment. Considering only these 39 subjects, 33 
chose the candidate favored by the second ratings 
and 6 chose the candidate favored by the initial 
ratings. This finding is significantly different
from chance at the .001 level (χ² = 18.68, df =
1). For the remaining 57 subjects, 40 chose the
candidate that had received the higher evaluation 
from both the initial and second ratings, and 17 
chose the candidate who had received the lower 
evaluation from both ratings. 

If all 96 subjects are considered, additional
support for Hypothesis 2 is obtained from the 
finding that when the first importance ratings are 
used as criteria for the second choice, 46 sub- 
jects chose the higher rated candidate, while 50 
chose the lower rated candidate, a distribution 
not significantly different from chance (χ² = .16,
df = 1). If the second set of ratings is considered,
73 subjects chose the higher rated candidate and 
23 the lower rated candidate, a distribution differing
from chance at the .001 level of significance
(χ² = 26.04, df = 1). Thus it can be seen
that the initial ratings do not predict the second 
choice at a better than chance level, but that the 
second or revised ratings make the correct prediction
for 76% of the subjects, a highly significant
finding.


DISCUSSION


There are several implications of the results 
that are interesting to note. For example, there 
is the suggestion in these findings that dissonance
served to catalyze the original trait ratings. It is
probable that subjects, in making the original
ratings, did not have any explicit referent in
mind (this probably accounts for many of the
decision reversals). Once the first decision was
made, the original attribute ratings either supported
or contradicted the decision. It is at this
stage where “there is less emphasis on objectivity
and there is more partiality and bias in the
way in which the person views and evaluates the
alternatives [Festinger, 1964, p. 155].” Clearly
the subject engaged in some reevaluations, but he
also organized his subsequent acts and evaluations
consistent with the reevaluations. The way in
which the second decision was handled apparently
was highly contingent on the events that occurred
after the first decision. An alternative
explanation is that the criteria which a subject 
rated came to be understood by him only upon 
confrontation with the description of the candidate.
Several data argue that this alternative is
relatively untenable. First, such an explanation 
does not account for the rather high level at
which the initial ratings predicted the first choice 
and the second ratings predicted the second 
choice. If subjects did not understand the cri- 
teria before they read the candidate descriptions, 
their importance ratings of these criteria could 
hardly be expected to predict their choice. Second,
if the reevaluation of criteria merely reflects
a greater understanding of the criteria,
there should be less change after the second 
choice than after the first. As noted earlier, this
did not occur; in fact, there was slightly more
change following the second choice than there was 
after the first choice. 

It is somewhat surprising that the effects 
were as strong as they were since the subjects 
were essentially role playing and did not decide
among alternatives that would affect them personally.
The only reason for noting this is that
importance is assumed to be a crucial variable 
in dissonance theory, and it is apparent that the 
present situation was of slight importance. What
probably is mediating the present results is the
variable of commitment. If only one candidate
can fill the vice-presidency, it is clear that a
choice between two persons unequivocally rejects
one candidate. When this condition obtains, dissonance
should be considerable since “significant
dissonance reduction is obtained after the choice 
only if the person has definitely, by his decisions, 
given up the unchosen alternative [Festinger, 
1964, p. 156].” 

There is one aspect of the present experiment
that has considerable theoretical importance. In
most studies of postdecision behavior, it is observed
that the specific attributes of the dissonant
object are reevaluated. As a result, the
chosen object itself is seen as more attractive 
than the unchosen objects. The present study 
suggests that attributes which transcend the 
particular incident are reordered to support a 
decision. It is not just that the chosen candidate 
is viewed as more attractive than the rejected 
candidate; it is the fact that the bases on which 
he was selected gain added value as predictors. 
The characteristics associated with the chosen 
candidate assume added importance as criterial 
attributes (Bruner, Goodnow, & Austin, 1956) 
for categorizing subsequent candidates. Using the 
definition of criterial attribute proposed by 
Bruner et al. (“any attribute which when changed
in value alters the likelihood of an object being
categorized in a certain way [p. 31]”), it should
be apparent that the present research suggests 
that an observer may alter significantly his cate- 
gorization of events if this alteration serves to 
reduce dissonance. In other words, the basis on 
which an event is discriminated, or said to exist, 
can be affected by the extent to which the actor's 
exposure to that event is consonant.

COMMENT ON WATERMAN AND FORD'S DISSONANCE
REDUCTION OR DIFFERENTIAL RECALL


CLYDE HENDRICK¹


University of Missouri 


An experiment by Waterman and Ford (1965) attempted to show that the
finding of Aronson and Carlsmith (1962) that Ss preferred to confirm expectancies
of failure rather than to succeed could be accounted for by differential
recall of Ss in the different expectancy conditions. The present paper argues 
that differential recall cannot account for Aronson and Carlsmith’s results be- 
cause the low expectancy-high performance Ss in their experiment changed 
significantly more responses than the low expectancy-low performance Ss. 
However, in the Waterman-Ford experiment there was no difference between 
the 2 low-expectancy conditions in amount of recall. 


Waterman and Ford (1965) have suggested 
that the results of Aronson and Carlsmith’s 
(1962) performance-expectancy experiment did 
not demonstrate that persons tend to reinstate 
disconfirmed expectancies. Instead, Waterman 
and Ford argued that Aronson and Carlsmith’s 
results are due simply to differential recall and 
to the tendency of persons to retain a good 
score and to improve a poor one. They reasoned 
that subjects in the high-expectancy conditions 
who experienced repeated success would have 
developed a consistent response rule for making 
their choices because they were always rein- 
forced. The subjects in the low-expectancy con- 
ditions who experienced repeated failure would 
have been discouraged from developing any con- 
sistent rule for making their choices. Also, these 
latter subjects might have become less inter- 
ested in the task and paid less attention to it. 
From these assumptions they argued that sub- 
jects in the high-expectancy conditions should 
have had greater recall of their responses than 
subjects in the low-expectancy conditions. High- 
expectancy subjects who were given a high per- 
formance on the last trial would tend to recall 
their responses correctly and stay with their 
recalled responses. Low-expectancy subjects who 
were given a high performance on the last trial 
would tend to recall their responses incorrectly, 
and they would also attempt to stay with their 
recalled responses. However, since they could not
recall as well, the low expectancy-high performance
subjects would appear to have changed
more of their responses than subjects in the 
high expectancy-high performance condition. 

High-expectancy subjects who were given a low 
performance on the last trial would tend to 

¹ The author would like to thank Judson Mills
for his suggestions and helpful criticism in the
preparation of this manuscript.

recall correctly and attempt to change their 
responses to achieve a better score. Low- 
expectancy subjects who were given a low performance
on the last trial would tend to recall
incorrectly and also attempt to change their re- 
sponses to achieve a better score. However, since 
low expectancy-low performance subjects could 
not recall as well, they might have changed fewer
of their responses than the high expectancy-low 
performance subjects because they would in- 
advertently stay with their initial responses. 
Since Waterman and Ford were able to show 
that subjects in high-expectancy conditions re- 
called to a significantly greater extent than sub- 
jects in low-expectancy conditions, they con- 
cluded that differential recall could account for 
Aronson and Carlsmith's results. 

Waterman and Ford's analysis focused on com- 
parisons between expectancy levels within a 
particular performance level. However, they 
failed to make the crucial comparison between 
the low expectancy-low performance and the
low expectancy-high performance conditions.
Waterman and Ford’s results show no difference 
between these latter two conditions in amount of 
recall. This is as one would expect. Since subjects
in the two low-expectancy conditions did
not differ in recall, from Waterman and Ford's
position there is no reason to expect a greater
number of card changes in the low expectancy- 
high performance condition than in the low 
expectancy-low performance condition. If anything,
their interpretation might predict more
changes in the low expectancy-low performance 
condition than in the low expectancy-high per- 
formance condition because subjects in the for- 
mer condition would have a greater tendency to 
change their responses in order to improve their 
low score. However, when we examine Aronson 
and Carlsmith’s results, we find that the low
expectancy-high performance subjects changed
significantly more responses than the low
expectancy-low performance subjects. This difference
was significant at the .01 level. Since
subjects in the two low-expectancy conditions of
Waterman and Ford's experiment were equal in
the amount of recall, their results do not account
for the significant difference between these two
conditions obtained by Aronson and Carlsmith.
Thus, although Waterman and Ford have presented
evidence that there are differences in
recall in this type of experimental situation, they
have not demonstrated that Aronson and Carlsmith's
findings can be satisfactorily interpreted
in terms of differential recall and the attempt
of subjects to obtain a good score and improve 
a poor one.

INDUCING BELIEF IN FALSE CONFESSIONS¹


DARYL J. BEM 


Carnegie Institute of Technology 


College Ss participated in individual experimental sessions disguised as research
on lie detection. After crossing out specified words on a word list, each S was
trained to utter true statements in the presence of a “truth light” and false
statements in the presence of a “lie light.” He was then required to state aloud
that he had previously crossed out certain words and had not crossed out
others. Half of these “confessions” were false, and each was made in the
presence of 1 of the 2 lights. As predicted, false confessions in the truth light
produced more subsequent errors of recall and less confidence in recall accuracy
than either false confessions in the lie light or no confession at all.


An individual's beliefs and attitudes can be 
manipulated by inducing him to role play, deliver
a persuasive communication, or engage in any
behavior that would characteristically imply his
endorsement of a particular set of beliefs
(Brehm & Cohen, 1962; King & Janis, 1956;
Scott, 1957, 1959). A recent experimental analysis
of these phenomena demonstrates that an
individual bases his subsequent beliefs and attitudes
on such self-observed behaviors to the
extent that these behaviors are emitted under
circumstances that have in the past characteristically
set the occasion for telling the truth.
Conversely, such control over an individual's
beliefs and attitudes is vitiated to the extent that
cues are present implying that the behavior is
deceitful or, more generally, is being emitted for
immediate specific reinforcement (Bem, 1965).
The effectiveness of self-persuasion can thus be 
altered by many of the techniques typically used 
to manipulate the credibility of any persuasive

¹ The laboratory facilities for this research were
provided by Harlan L. Lane of the University of
Michigan.

communicator. For example, just as a com- 
municator is more persuasive to others if he 
appears to be free from coercion or if he is 
known to be receiving no payment for his com- 
munication, so too, it is found that he is more 
likely to persuade himself under such circum- 
stances (Bem, 1965). In fact, it has been sug- 
gested that American prisoners of war in the 
Korean conflict came to believe in some of the 
false confessions they were induced to make 
partially because the threat of punishment for
noncompliance was not present (Brehm & Cohen, 
1962, pp. 286-298). 

The present experiment explores this conjecture
indirectly by attempting to verify the possibility
that a false confession can effectively
distort an individual’s recall of his past behavior 
if the confession is emitted in the presence of 
cues previously associated with telling the truth. 
The design also permits a test of the hypothesis 
that cues previously associated with lying can 
create self-disbelief in true confessions, leading 
again to distortions in recall of the actual behavior.
More generally, the experiment attempts
to extend to a new dependent variable the evidence
that an individual’s beliefs and attitudes
are often based on observations of his own overt 
behavior and its apparent controlling variables. 
Although support for this proposition is now 
available for beliefs about external events, atti- 
tudes of many kinds, and self-judgments of 
hunger and emotional states (Bem, 1965), it has 
not been demonstrated that an individual’s recall 
of his past behavior can be controlled by a 
manipulation of his current verbal behavior. 
Each subject in the present experiment per- 
forms a word task in which he crosses out some 
words and not others. In the subsequent experi- 
mental session, he is trained to utter true state- 
ments in the presence of a colored light that 
we shall call the "truth light" and to make false 
statements in the presence of a second colored
light that we shall call the “lie light.” Each subject
is then required to state aloud that he had
previously crossed out certain words and had 
not crossed out certain others. Half of these
required "confessions" are false, half are true, 
and each one is made in the presence of one of 
the two lights. After each confession, the subject 
attempts to recall whether or not he had actually 
crossed out the word. The main prediction is 
that false confessions emitted in the presence 
of the truth light will produce more errors of 
recall than either false confessions emitted in 
the presence of the lie light or no confession 
at all. A secondary, complementary, prediction is 
that true confessions emitted in the presence of 
the lie light may also produce errors of recall, 
since the visual cue "tells" the subject that his 
statement is false. It will be noted that each 
subject is his own control, and that each subject 
provides a complete replication of the experiment. 


METHOD 2


Six male and five female college students were
hired for individual experimental sessions to “help
us find out if certain aspects of the human voice
can be used for purposes of lie detection.” After
being seated at a desk containing a microphone and
desk lamp in a small acoustically tiled recording 
room, the subject was handed a list of 100 common 
nouns and an alphabetical list containing 50 of the 
words. The subject was told: 


This is an experiment designed to see if certain 
aspects of the human voice can be used for purposes
of lie detection. You will be given a number


2 All the written stimulus materials and a detailed
procedural description for this experiment are repro- 
duced in full in the laboratory manual by Lane 
and Bem (1965). The experiment, as adapted there 
for use in experimental psychology courses, has now 
been replicated many times by student experimenters, 


of things to say into the microphone, and I will
take various measurements on your voice as you
do this. First, however, I would like you to
complete a preliminary task. You are to draw a
line through each word on this word list that
also appears in this alphabetical guide. Go through
the word list only once, at your own speed, read- 
ing each word in turn and then checking to see 
if it occurs in the alphabetical guide. 


After completing this task, the subject filled out 
an information form in order to “provide us with 
facts we can ask you about in testing lie detection.” 
This 50-item form contained such questions as 
“What is your major field of study?” “Are you 
generally favorable to sororities and fraternities?” 
“What brand of toothpaste do you use?” etc. After
obtaining the completed forms, the experimenter left 
the room, and all further communication with the 
subject was conducted with an intercom. The fol- 
lowing training procedure was then employed to 
establish two lights as discriminative stimuli that 
would indicate that verbal behavior in the presence 
of the one was truth telling and in the presence of 
the other, lying. The subject was told:


I will now ask you questions one at a time
from the information form that you have just
filled out. After I ask you each question, the
equipment will be turned on, automatically illuminating
one of two colored lights in the ceiling
fixture. You should then answer the question into
the microphone. Whenever the green light is on,
you are to answer the question truthfully; whenever
the amber light is on, you should make up
an untrue answer to the question and speak it
into the microphone as convincingly and as naturally
as possible. Your answers should be complete
statements. For example, I will ask “What
is your first name?” If the green light goes on,
then you would answer “My first name is ____,”
giving your real first name. If the amber light
goes on, you would make up some other name.
If you make a mistake or do not answer with
a complete sentence, I will ask you the same
question again.


You will now make statements concerning some 
of the words you saw earlier. When I ask you
to state that you crossed out a particular word
(for example, if I say “Did-TREE”), you should wait
until the equipment is turned on and then make
a statement of the form “I did cross out the
word TREE.” If I ask you to deny having crossed
out a word (“Did not-TREE”), you should say
“I did not cross out the word TREE.” Do not
begin your statement until the equipment has been
turned on as indicated by the two lights which
will continue to flash on and off in random
sequence. 


The session then proceeded as described. Using
a predetermined schedule, the experimenter announced
a word and instructed the subject either
to state that he had or that he had not crossed out
the word previously. One of the two colored lights
was then illuminated; the subject made his “confession”;
the colored light was turned off, and the white
desk lamp was turned back on. After each “confession,”
the subject entered the word onto a sheet
of paper, indicated whether he recalled crossing out 
the word or not crossing out the word previously, 
and marked how confident he was in the accuracy 
of his recall using the following scale: 1, not sure 
at all; 2, slightly sure; 3, moderately sure; 4, quite 
sure; 5, absolutely sure. 

Fifty words from the word list were employed, 
10 in each of the following conditions: false 
confession-truth light, false confession-lie light, true 
confession-truth light, true confession-lie light, con- 
trol (recall only; no confession). Half of the words 
in each condition had actually been crossed out; half 
had not been crossed out. A postexperimental questionnaire
assessed the subject’s awareness of any
effects of his confessions or the lights on his recall 
and checked on the success of the minor decep- 
tion employed. Finally, each subject was paid for 
his participation and told the true purpose of 
the experiment. 
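
The balanced layout described above (50 words, 10 per condition, half of them previously crossed out) can be sketched in code. This is only an illustration under stated assumptions, not the authors' materials: the function names `assign_conditions` and `required_statement`, the placeholder word lists, and the seeded random sampling are all hypothetical, whereas the original study used a predetermined schedule.

```python
import random

# The five conditions named in the text.
CONDITIONS = [
    "false confession-truth light",
    "false confession-lie light",
    "true confession-truth light",
    "true confession-lie light",
    "control",
]


def assign_conditions(crossed, uncrossed, per_cell=5, seed=0):
    """Give each condition `per_cell` crossed-out words and `per_cell`
    words that were not crossed out (hypothetical helper)."""
    rng = random.Random(seed)
    need = per_cell * len(CONDITIONS)
    crossed_pick = rng.sample(crossed, need)
    uncrossed_pick = rng.sample(uncrossed, need)
    design = {}
    for i, cond in enumerate(CONDITIONS):
        lo, hi = i * per_cell, (i + 1) * per_cell
        design[cond] = {
            "crossed": crossed_pick[lo:hi],
            "uncrossed": uncrossed_pick[lo:hi],
        }
    return design


def required_statement(word, was_crossed, confession_is_true):
    """Statement the subject must utter for one word.

    A true confession matches what the subject actually did;
    a false confession states the opposite.
    """
    says_did = was_crossed if confession_is_true else not was_crossed
    if says_did:
        return f"I did cross out the word {word}."
    return f"I did not cross out the word {word}."


# Placeholder words standing in for the 100 common nouns (assumption).
crossed_words = [f"CROSSED{i}" for i in range(50)]
uncrossed_words = [f"KEPT{i}" for i in range(50)]
design = assign_conditions(crossed_words, uncrossed_words)
```

With this layout every condition contains 10 words, and a false confession about a crossed-out word takes the form "I did not cross out the word ...".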

This procedure, then, assessed the control of recall
exercised by overt verbal statements emitted in the
presence of two discriminative stimuli, one of which
had a history of pairing with true responses, the
other with false responses.


RESULTS AND DISCUSSION 


The major predictions are that false statements
emitted in the presence of the truth light
will produce more errors of recall than either 
false statements emitted in the presence of the 
lie light or no statement at all. The first column 
of Table 1 compares the number of recall errors 
made in these two light conditions with each 
other and with recall errors for the 10 control 
words which the subjects were simply asked to 
recall. One-sample t tests based on difference
scores for each subject test these one-tailed
hypotheses.
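
A one-sample t test on difference scores can be sketched as follows. The per-subject differences below are invented purely for illustration and are not the study's data; the function name `one_sample_t` is likewise an assumption. The sketch only shows the arithmetic: each subject contributes one difference (for example, truth-light errors minus lie-light errors), and the mean difference is divided by its standard error.

```python
import math
from statistics import mean, stdev


def one_sample_t(diffs):
    """Return (t, df) for H0: mean difference = 0."""
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # sample SD uses n - 1
    return mean(diffs) / standard_error, n - 1


# Hypothetical difference scores for 11 subjects (NOT the published data).
hypothetical_diffs = [2, 1, 3, 0, 2, 4, 1, 2, 3, 2, 2]
t_value, df = one_sample_t(hypothetical_diffs)
```

For a one-tailed hypothesis, the resulting t is referred to the t distribution with n - 1 degrees of freedom in the predicted direction only.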

It is seen that the hypotheses receive strong
support. False statements made in the presence
of the truth light lead to significantly more recall 
errors than either a false statement in the pres- 
ence of the lie light or no confession at all. The 
consistency of the effect is revealed by the fact 
that none of the 11 subjects made more recall 
errors in the lie-light condition than in the 
truth-light condition (p < .001 by a one-tailed 
sign test), and only 2 of the subjects made
more errors in the control condition than in the
truth-light condition (p = .033).


TABLE 1

RECALL ACCURACY AND CONFIDENCE RATINGS
FOLLOWING FALSE CONFESSIONS
(N = 11)

                    Mean number of    Mean ratings of
Condition           recall errors     confidence
                    (10 trials)       (range: 1-5)
Truth light (A)         3.82              3.21
Lie light (B)           1.82              3.46
Control (C)             2.46              3.64

                          t                 t
A versus B          [illegible]           1.86*
A versus C              2.02*          [illegible]
B versus C               .89              1.07

* p < .05.
** p < .01.
*** p < .0005.
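
The two sign-test probabilities reported above can be checked with an exact binomial calculation. This is a minimal sketch that assumes all 11 subjects enter each test (that is, no ties are dropped); the function name `sign_test_one_tailed` is hypothetical.

```python
from math import comb


def sign_test_one_tailed(k_contrary, n):
    """P(X <= k_contrary) for X ~ Binomial(n, 0.5).

    k_contrary is the number of subjects whose difference runs
    against the predicted direction.
    """
    return sum(comb(n, i) for i in range(k_contrary + 1)) / 2 ** n


# No subject made more errors under the lie light than under the truth light.
p_lie_vs_truth = sign_test_one_tailed(0, 11)

# Two subjects made more errors in the control than in the truth-light condition.
p_control_vs_truth = sign_test_one_tailed(2, 11)
```

Under that assumption the first probability is 1/2048, which is below .001, and the second is 67/2048, which rounds to .033.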

After each trial, subjects rated their confidence 
in their recall accuracy on a 5-point scale, where 
a rating of 5 indicated absolute certainty. The 
second column of Table 1 displays these data 
for the three conditions. In general the conclusions
are parallel to those yielded by the
recall data: A false statement emitted in the 
presence of the truth light leads subjects to have 
decreased confidence in the accuracy of their 
recall, a judgment of their own behavior, it will 
be noted, that is accurate. 

For 20 words, subjects were required to emit 
overt statements that were actually correct, 10 
“true confessions” in each light condition. For 
the 10 words in the lie-light condition—where 
the light “contradicts” the validity of the state- 
ment—subjects made an average of 3.82 recall 
errors, a frequency equal to that found in the 
truth-light condition for false statements. For the 
10 words in the truth-light condition—where the 
light “confirms” the correctness of their state- 
ment—subjects made an average of 2.36 recall 
errors. This difference between the two light
conditions is significant (t = 1.90, p < .05, one-tailed).
There is, then, some evidence that cues
that have previously set the occasion for false- 
hood can raise doubts in the communicator 
himself about the validity of true statements he 
has uttered. 

A few of the subjects indicated on the post- 
experimental questionnaire that they felt that the 
confessions and lights may have impaired their 
ability to recall correctly. Only one subject, 
however, suspected any systematic relation be- 
tween the lights and the truth of the confession. 
She commented that “at the end I realized that 
the amber light was on when I was telling the
truth and the green when I was not.” This is
not correct, of course, since true and false state- 
ments in the two conditions were exactly coun- 
terbalanced. Her comment is, however, another 
datum indicating that the experimental treat- 
ments did, in fact, distort recall in the predicted 
directions. It would appear that any “awareness” 
in the present experiment is part of the outcome, 
not a cause of the experimental effects (cf. Bem, 
1965). 

In this experiment, the controlling manipula- 
tions are much weaker and the dependent vari- 
able much simpler than the conditions and be- 
haviors involved in the brainwashing of prisoners 
of war. The present study was designed not to
replicate such conditions, but to provide an 
existence proof for a phenomenon presumed to 
operate within them: the possibility that false 
statements can distort an individual’s recall of
his past behavior as a function of the credibility
cues present at the time these statements are
emitted. More generally, the positive results of
the present experiment extend the evidence for
the proposition that an individual’s beliefs and
attitudes are often based on observations of his
own overt behavior and its apparent controlling
variables.


MODIFICATION OF PSYCHOPHYSICAL JUDGMENTS AS A
METHOD OF REDUCING DISSONANCE 1


PAUL R. WILSON 2 AND PAUL N. RUSSELL
University of Canterbury, Christchurch, New Zealand 


Dissonance reduction was measured in a situation requiring psychophysical 
judgments. Sixty Ss estimated the height they lifted a heavy and a light weight
which were both lifted the same vertical distance. Dissonance was aroused in
20 Ss by rewarding them little money for lifting a heavy weight and relatively
more money for a considerably lighter weight. A further 20 Ss received reward
in proportion to the weight lifted while the remaining 20 Ss were not rewarded.
It was hypothesized that Ss who received reward disproportionate to
weight lifted would reduce dissonance by underestimating the distance they
lifted the heavy weight relative to the light. Results support the hypothesis.


Experimental evidence offered in support of 
the theory of cognitive dissonance has been 
critically evaluated by Chapanis and Chapanis 
(1964). One inadequacy of research in this area, 
they suggest, is the overcomplexity of experi- 
mental manipulations. The present investigation 
sought to test deductions from dissonance theory 
in the simplest possible experimental setting, and 

1 This research was supported by a grant from the 
University of Canterbury made available to R. A. M. 
Gregson. 

2 Now at the Australian National University. 


where a minimum of measurement assumptions
was necessary. In the experiment subjects were
required to estimate the vertical height they
lifted a heavy and a light weight. Both weights
were in fact lifted to the same height. It was
predicted from dissonance theory that subjects
would underestimate the height they lifted a
heavy weight when given little remuneration relative
to the height they lifted a light weight when
given greater remuneration. The question is: Do
subjects modify psychophysical judgments when
attempting to reduce dissonance?


METHOD
Subjects


Thirty male and 30 female students, waiting to
enroll for introductory courses in psychology or
sociology at the University of Canterbury, served 
as subjects. All were right-handed. 


Apparatus 


The apparatus consisted of a box 3 × 2 feet and 2
feet high with two handles 1 foot apart on the
top, and with a smooth strip of wood running along
its top front edge. One of the handles was connected 
by a rope to a 7-pound weight while the other was 
connected by a rope to a weight weighing 1 pound. 
Both handles could be pulled to a height of exactly 
18 inches. When the subject was seated in front of 
the apparatus, the handles were at arm’s length 
which made a lift of 7 pounds quite difficult. 


Procedure 


A total of 60 subjects were assigned randomly to 
three groups each consisting of 10 males and 10 
females. Group A received 1 shilling (approximately 
14 cents) for lifting a 1-pound weight and a penny 
(approximately 1.2 cents) for lifting a 7-pound 
weight while Group B was given 1 shilling for lifting
the heavy weight and 1 penny for the light
weight. Group C received no remuneration. The
subjects’ task in each group is best explained by the
instructions given to them. In all three groups, each
subject, upon entering the experimental room, was
asked to sit down in front of the apparatus and read
the following typewritten instructions handed to him. 


The experiment you are going to do is one on 
motor control. On the top of the box in front of 
you there are 2 handles. You have to pull with 
your left hand each of the handles in turn. When 
you hear the word "start" put on the goggles 
provided, which act as a blindfold. When you hear
the word "left" pull the handle on the lefthand 
side of the box as far as it will go. When you 
hear the word "right" pull the handle on the 
righthand side of the box as far as it will go. Do 
not attempt to pull the handles too suddenly. 

After you have pulled each handle put it down 
and partly take off your goggles so that you can 
see. You will then receive a sum of money, which 
you can keep, so put the money in your pocket. 
Put your goggles back on and then put your right 
hand on the edge of the smooth piece of wood 
which runs across the front of the box. Run your
hand along this piece of wood as far as you think 
you lifted the handle. Indicate with your fore- 
finger your judgment. 

The experimenter will tell you when you have 
finished the experiment. The experiment will only 
take about 5 minutes. When you have read and 
fully understood these instructions say "ready." 
Then listen for the word "start." 


Because Group C received no remuneration, their
instructions were slightly modified by omitting
“. . . and partly take off your goggles. . . . Put your
goggles back on . . .” from the above instructions.

Five trials were run for each subject; that is, every 
subject lifted each weight five times. It was ex- 
pected that the disproportionate reward-effort rela- 
tionship in Group A would create dissonance within 
the subjects. These subjects could reduce dissonance 
by underestimating the height they lifted the heavy 
weight relative to the light. In Group B reward was 
in proportion to effort so no dissonance should be 
created, while Group C received no reward so no 
reward-effort dissonant cognitions should exist. 

In all three conditions goggles were worn through- 
out the experiment to stop the subjects obtaining 
visual cues when making estimations on the smooth 
strip of wood. To reduce kinesthetic cues the sub- 
jects alternated the end of the strip of wood from 
which they began to make their estimations. In all
three conditions the order of lifting weights was 
reversed; that is, in Conditions A, B, and C half 
the subjects lifted the light weight before the heavy 
while the remaining half lifted the heavy before the 
light. 



RESULTS 


Each subject made five estimations of the 
vertical distance he lifted each weight. The 
difference in these estimations was found for 
each subject on every trial by subtracting the
height-lifted estimation for the heavy weight
from that for the light weight. All measurements
of height estimations are expressed as the
number of .25-inch units.


TABLE 1

ANALYSIS OF VARIANCE OF DIFFERENCES BETWEEN
HEIGHT-LIFTED ESTIMATIONS FOR HEAVY
AND LIGHT WEIGHTS

Source                          df       MS          F
Between subjects                59
  Reward conditions (A)          2     1146.9      5.04**
  Sex (B)                        1      693.1      3.04
  Order (C)                      1      304.0      1.3[?]
  A × B                          2   [illegible]   <1
  A × C                          2      111.6      <1
  B × C                          1      130.7      <1
  A × B × C                      2      268.8      1.18
  Subjects within groups        48      227.7
Within subjects                240
  Trials (D)                     4      190.5      1.90
  A × D                          8       93.1      <1
  B × D                          4      210.9      2.10
  C × D                          4      172.5      1.72
  A × B × D                      8      173.6      1.73
  B × C × D                      4      246.6      2.45*
  A × C × D                      8       65.4      <1
  A × B × C × D                  8      179.4      1.79
  D × Subjects within groups   192      100.4

* p < .05.
** p < .025.

TABLE 2

MEAN DIFFERENCE BETWEEN HEAVY AND LIGHT
WEIGHT HEIGHT-LIFTED ESTIMATIONS (HEAVY
WEIGHT HEIGHT ESTIMATION SUBTRACTED
FROM LIGHT WEIGHT ESTIMATION)

Group                                  Mean number of .25-inch units
Reward disproportionate to effort         5.56
Reward in proportion to effort            0.91
No reward                                -1.08


These differences were treated by a four-way
analysis of variance with repeated observations
on one factor (trials), a summary of which is
given in Table 1 (Winer, 1962, pp. 337-353). The 
results under the three different reward condi- 
tions are significantly different beyond the 2.5% 
level. 
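
The F ratios in Table 1 follow from dividing each effect's mean square by the appropriate error mean square. The sketch below is a rough check, not the authors' computation: it assumes the mean squares as read from the printed table, and it assumes the usual convention for this kind of design, in which between-subjects effects are tested against "Subjects within groups" and within-subjects effects against "D × Subjects within groups". The function name `f_ratio` is hypothetical.

```python
# Error mean squares as read from Table 1 (assumed to be transcribed correctly).
MS_BETWEEN_ERROR = 227.7  # Subjects within groups, df = 48
MS_WITHIN_ERROR = 100.4   # D x Subjects within groups, df = 192

# A few effect mean squares from the same table.
between_effects = {"Reward conditions (A)": 1146.9, "Sex (B)": 693.1}
within_effects = {"Trials (D)": 190.5, "A x D": 93.1}


def f_ratio(ms_effect, ms_error):
    """F is the effect mean square divided by its error mean square."""
    return ms_effect / ms_error


f_between = {
    name: round(f_ratio(ms, MS_BETWEEN_ERROR), 2)
    for name, ms in between_effects.items()
}
f_within = {
    name: round(f_ratio(ms, MS_WITHIN_ERROR), 2)
    for name, ms in within_effects.items()
}
```

Under these assumptions the ratio for reward conditions comes out near 5.04 on 2 and 48 degrees of freedom, in line with the tabled value.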

Table 2 gives mean differences in estimations 
for the three reward groups. The mean difference 
for Group A is larger than that for Groups B 
and C. A positive difference means that the esti- 
mated height lifted for the light weight was 
greater than that for the heavy weight. Absence
of a significant Reward Conditions X Trials inter- 
action and Reward Conditions X Order interac- 
tion indicate that the Group A subjects’ differ- 
ences in height-lifted estimations for the two 
weights are greater than those for Groups B 
and C irrespective of trials and order. 
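
Because the estimations are expressed in .25-inch units, the Table 2 means can be converted to inches and compared with the 18-inch lift. The conversion below is a small worked sketch; the variable names are hypothetical, and expressing each difference as a fraction of the lift is an added illustration rather than an analysis reported by the authors.

```python
UNIT_INCHES = 0.25   # one measurement unit, per the Results section
LIFT_INCHES = 18.0   # actual height both handles could be pulled

# Mean differences (light minus heavy) from Table 2, in .25-inch units.
table2_means = {
    "Reward disproportionate to effort": 5.56,
    "Reward in proportion to effort": 0.91,
    "No reward": -1.08,
}

# Convert each mean difference to inches.
mean_diff_inches = {
    group: units * UNIT_INCHES for group, units in table2_means.items()
}

# Express each difference as a fraction of the actual lift height.
fraction_of_lift = {
    group: inches / LIFT_INCHES for group, inches in mean_diff_inches.items()
}
```

On this conversion the Group A difference is about 1.39 inches, or roughly 8% of the actual lift.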

The only other significant F is the Sex X Order 
X Trials interaction which is significant at the 
.05 level. A graphing of this showed that female
differences in light and heavy weight height-lifted
estimations are slightly greater than the
corresponding male values for both the heavy-light
and light-heavy lifting orders on the first
three trials. Thereafter the female light-heavy
and heavy-light orders diverge with the heavy- 
light order showing the light weight height-lifted 
estimation to be considerably greater. Differences 
in male height-lifted estimations for the two 
weights are less variable and fluctuate about the 
zero difference level for both orders on all trials.
This interaction does not concern differences in
the three reward conditions and is thus irrelevant
to deductions from dissonance theory. No
explanation for it is thought necessary here.
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: -DISCUSSION - De 
It appears that dissonance was aroused in one ,, ^ 
group of subjects (Group A) by rewarding them » 
with more money for lifting a light weight than 4° 
a heavy weight they lifted the same height. We", 
hypothesized that Group A would reduce dis- i 
sonance by underestimating the hdight they lifted `} 
the heavy weight relative to the height they: . 
lifted the light weight. In other words, they : 
would reduce dissonance by behaving as though , 
reward in proportion to effort and no reward. 

However, many subjects spontaneously re- 


they did not have to pull the heavy weight as 
ported that they lifted the two weights the same . | 


far as they had to pull the light weight and 
thus justifying to themselves, partially at least, 
the disproportionate reward-effort relationship. 
Results show that this underestimation is signifi- 
cantly greater for Group A than for the Control 
Groups B and C who, respectively, received 
height. Furthermore, they said they could not “ 
see what effect the money had on their estima- 
tions. Despite this, reward has clearly affected ] 
their estimatións. Dissonance reduction appears 
to operate without the subjects’ being aware of y. 
its occurrence. ; 
The order in which the heights were lifted 


did not reduce the dissonance effect nor was, 
there any lessening of dissonance over successive } }./ 
trials. The dissonance may disappear if more. 
than five trials are given, for in time subjects | 
would come to expect a disproportionate reward f 
situation or to learn that one gets little money 
for a lot of effort in this particular setting. ; 
To conclude, this study shows that psycho- ` 
physical judgments can be modified when sub“ 
jects attempt to reduce dissonance. It also sug- . / 
gests that experiments involving psychophysical Y 
judgments can provide a useful setting for some | 
social psychological research. The main advan; 
tage of this type of situation appears to lie in © 
the ease of measurement and the degree to which 
relevant variables can be isolated and controlled. 